ABSTRACT


The design of a decision analysis is itself a complex decision
problem. In theory, each aspect of analysis (encoding the probability
density functions of state variables, encoding the von Neumann-Morgenstern
utility function, and computing profit lotteries) is an experiment. The
results of the experiments, the data, are used to update the probabilities
in the primary decision problem. The economic value of the experiment
is the well-known value of imperfect information.

The drawback to the theoretical approach is that the data are functions.
Practical methods for encoding prior distributions over functions
do not exist. Therefore, the traditional approach is to parameterize
the data.

Our approach is unique because we show that for an interesting class
of decision problems, arbitrary parameterization is not necessary. The
value of any data depends probabilistically only on the prior covariances
of the posterior means. For independent state variables this
quantity reduces to an estimate of how much the mean of a probability
density function will shift during an experiment.

With some limitations this result extends to the local risk aversion
coefficient. The coefficient can be treated as if it were a state
variable. The value of assessing the complete utility function is then
proportional to the prior variance of the posterior coefficient. Once
again, encoding the potential mean shift is the key to the value of data
generation.

The main result for computation is logically separate from the
previous ones. The problem is to find the optimal quantization for a
single decision variable. Monte Carlo samples from the profit or value
function can be generated for any setting of the decision variable. For
a fixed total sample size, should we sample many times at a few decision
points or a few times at many decision points? The answer is that fine
quantization, implying many decision settings, is always superior.
However, the expected loss from rough quantization is very small.

In the final chapter of the thesis we present flow charts which
show how our results can be applied to the design of a practical decision
analysis.
CHAPTER 1


OVERVIEW 


1.0  Introduction 

The design of a decision analysis is itself a complex decision
problem. This dissertation addresses the analyst's decision of how much
computation and assessment is economically justified for a given primary
decision problem. The results are in two areas. In the first part of
the thesis we extend decision theory to cover problems that can be
approximated by Taylor series. These results apply to large decision
problems where complete computation is infeasible. In the second part
of the thesis we apply the results to the specific decisions of setting
the levels of assessment and computation within a decision analysis.

The practical side of analysis, problem bounding and analytical design,
has always been left to intuition. To handle extremely complex
problems, a more formal approach is necessary. The analyst's skill at
problem formulation will never be eliminated, but the approximate techniques
developed in this dissertation should allow him to start with a
very general representation of the problem and rationally eliminate the
unimportant aspects.

1.1 Decision Analysis

Decision analysis is a practical discipline. It rests on twin
foundations of decision theory and systems analysis. The reader of this
dissertation will need at least an elementary knowledge of Bayesian
decision theory. Excellent introductions to subjective probability and
risk preference are given in Howard [2] and Raiffa [6]. Systems theory
allows us to extend rational analysis to complex problems. The reader
should understand the manipulation of matrices and optimization of
functionals.

The state of information is a fundamental concept in decision analysis.
The state of information that concerns us most frequently is $\mathcal{E}$,
the decision maker's prior knowledge and experience. Another familiar
state of information in decision analysis is clairvoyance $(C,\mathcal{E})$. The
clairvoyant knows the exact value of any uncertain variable. In this
dissertation we will normally be concerned with the augmented state of
information $(D,\mathcal{E})$. If the data $D$ contains no useful information,
$(D,\mathcal{E})$ reduces to $\mathcal{E}$, and if the data is perfect information $(D,\mathcal{E})$
becomes $(C,\mathcal{E})$.

The relationship of the three states of information can be clarified
using Howard's [2] decision analysis cycle. In Fig. 1.1 we
associate $\mathcal{E}$ with the deterministic phase, $(D,\mathcal{E})$ with the probabilistic
phase, and $(C,\mathcal{E})$ with the informational phase. Calling the initial
phase deterministic is a mild misnomer since it is the basis for preliminary
probabilistic estimates. The probability density function for
a state variable can be approximated from the estimates of its mean and
range. The profit lottery, the probability density function on the
value, can be estimated from sensitivity data using Taylor series. The
deterministic phase provides the given information for this paper. The
probabilistic phase encompasses the encoding and computation that we
wish to design. The informational phase is only of interest for its
role in the three-part analogy.

Numbers  in  square  brackets  refer  to  List  of  References. 
Related Work


We model computation and assessment as experiments. Almost all
texts on decision theory present one or more special cases of the results
in Chapters 2 and 3.

There are very few references that specifically address the problem
of the design of a decision analysis. Howard [2] discusses the general
philosophy. He points out that finding the right problem is as important
as solving it. Raiffa and Schlaifer [7] introduce the use of parameters
of probability distributions as random variables. Matheson [4]
proposes a structure which is very similar to ours. Specifically, he
introduces the concept that the purpose of analysis is to provide data
to improve the state of information in the primary problem. The main
difference between Matheson's work and ours is that we use an approximate
value function for the primary problem. The approximation drastically
reduces the required input, making it practical for application
to complex decision problems.

Approximate value functions based on Taylor series are introduced
in Howard [3]. Chapters 2 and 3 are an extension of Howard's structure.

1.2  Summary  of  Results 

The thesis begins and ends with examples that illustrate the application
of our theoretical results. The example at the start of Chapter
2 demonstrates that for certain problems deterministic, rather than
stochastic, sensitivities are sufficient to calculate the value of
clairvoyance. In Chapter 5 we return to the same example to illustrate the
application of the results from Chapters 2, 3 and 4.

In Chapter 2 we solve the single-stage risk-indifferent decision
problem which has many decision variables related to many state variables
through a second order value model. The results are exact for a
quadratic value function and approximate for a complex value function
that can be expanded in a Taylor series about the mean of the state
variables and the deterministic optimum decision. The expression for
the value of data can be decomposed into two parts. The first is a
matrix representing the difference between closed and open loop sensitivities.
Closed loop implies the ability to optimize the decision
variables after the state variables are revealed. The second is the
prior covariance of posterior means. For one state variable this quantity
reduces to the prior variance of the posterior mean, a single
parameter. This is a tremendous simplification over the general case
in which the value of data depends on our prior estimate of the
posterior probability distribution, a probability distribution over
probability distributions.

In Chapter 3 we extend the results of Chapter 2 to include exponential
risk aversion. The approximate value of clairvoyance derived
in Section 3.2 is not useful for calculations because it involves third
and fourth covariances. However, by considering special cases of the
value of clairvoyance we derive criteria which must hold for the results
of Chapter 2 to be valid. Finally, we calculate the loss from
deliberate suppression of risk preference and the gain from introducing
the decision maker's true utility function.

In Chapter 4 we address the question of discretizing a decision
variable when the value lottery is generated by Monte Carlo simulation.
The result is that a given number of random samples generates slightly
less expected error when the decision variable is finely discretized.
Regardless of the discretization level, the expected error is approximately
proportional to the total number of samples.

Chapter 5 is best summarized by Fig. 1.2. Each box represents a
stage in the design of a decision analysis. Before we can apply our
techniques, we need preliminary data. Then if the problem is suitable
for approximate analysis, we consider the encoding, risk preference
and computational decisions. For each state variable the encoding decision
is whether to encode a complete probability density function or to
use our preliminary estimate. The risk preference alternatives are to
use a linear, exponential or general risk preference function. The
computational alternatives are to stop after the preliminary analysis
or to continue with a Monte Carlo simulation.


Figure  1.2  Summary  of  the  economics  of  decision  analysis 


CHAPTER  2 


VALUE OF ANALYSIS FOR THE RISK-INDIFFERENT DECISION MAKER

2.0 Introduction

This chapter begins with a simple example to introduce the concept
of value of information. For the quadratic problem, which arises in
practice when the value function can be approximated by a second-order
Taylor series, we prove a general theorem for the value of data. This
theorem is extended in Chapter 3 and applied in Chapter 5. At the end
of this chapter we discuss how to handle non-quadratic problems and
deliberate errors.

2.1 Preliminaries

In  this  section  we  introduce  inferential  notation  and  the  general 
terminology  required  to  describe  a  decision  problem. 

Notation 

Inferential notation is well suited for this thesis because it
explicitly conditions all probabilities on a state of information. The
probability density function of a random variable $x$ conditioned on
the state of information $\mathcal{S}$ is denoted by

$$\{x \mid \mathcal{S}\} . \tag{2.1.1}$$

We use $\int_x$ as a generalized summation operator; thus the $k$th moment
of $x$ is

$$\langle x^k \mid \mathcal{S} \rangle = \int_x x^k \{x \mid \mathcal{S}\} \tag{2.1.2}$$

whether $x$ is continuous or discrete. Inferential notation can be
nested. For example,

$$\{\langle x \mid \mathcal{S}_2 \rangle \mid \mathcal{S}_1\} \tag{2.1.3}$$

implies that the mean of $\{x \mid \mathcal{S}_2\}$ is a random variable given only $\mathcal{S}_1$.

In addition to inferential notation, we use the following matrix
symbols:

$\underline{a}$ or $[a_i]$
    The underscored lower case letter denotes a column vector with
    element $a_i$.

$\underline{A}$ or $[a_{ij}]$
    The underscored capital letter denotes a square matrix with
    element $a_{ij}$.

$\underline{a}'$ or $\underline{A}'$
    The prime denotes transposition.

$\langle \underline{a} \mid \mathcal{S} \rangle$ or $[\langle a_i \mid \mathcal{S} \rangle]$
    A probabilistic operation is applied to each component of a vector.

Basic  Decision  Problem 

The deterministic model illustrated in Fig. 2.1 relates the three
elements of the basic decision problem. The decision variables $\underline{d}$ are
set by the decision maker. The state variables $\underline{s}$ are set by nature.
The value $v$ is the output measure that we want to maximize. If
$\underline{s}$ is known when the decision is made, we denote the decision that
maximizes the value function by $\hat{\underline{d}}(\underline{s})$:

$$\hat{\underline{d}}(\underline{s}) = \max^{-1}_{\underline{d}} v(\underline{s},\underline{d}) \tag{2.1.4}$$

However, in the basic decision problem illustrated in Fig. 2.1b,
$\underline{d}$ must be set before $\underline{s}$ is observed. The possible outcomes are described
by the probability density function $\{\underline{s} \mid \mathcal{E}\}$, where $\mathcal{E}$ is the state of
information that represents the decision maker's prior knowledge and
experience.

We assume that $\underline{s}$ is independent of $\underline{d}$ in the sense that

$$\{\underline{s} \mid \underline{d},\mathcal{E}\} = \{\underline{s} \mid \mathcal{E}\} . \tag{2.1.5}$$

This assumption is not restrictive. When the state variables are dependent
on the decision variables the problem can normally be reformulated
so that the dependence appears in the value function. The example
in the following section illustrates how state variables can be made
probabilistically independent of decision variables.

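The basic decision problem under uncertainty is to maximize the
expectation of $v$:

$$\max_{\underline{d}} \int_{\underline{s}} v(\underline{s},\underline{d}) \{\underline{s} \mid \mathcal{E}\} \tag{2.1.6}$$

The expansion rule from elementary probability theory is

$$\langle x \mid \mathcal{E} \rangle = \int_y \langle x \mid y,\mathcal{E} \rangle \{y \mid \mathcal{E}\} . \tag{2.1.7}$$

Using this rule, we can show that the inferential symbol for the expectation
in (2.1.6) is $\langle v \mid \underline{d},\mathcal{E} \rangle$:

$$\langle v \mid \underline{d},\mathcal{E} \rangle = \int_{\underline{s}} \langle v \mid \underline{s},\underline{d},\mathcal{E} \rangle \{\underline{s} \mid \mathcal{E}\} \tag{2.1.8}$$

The expectations in (2.1.6) and (2.1.8) are the same since the expected
value of $v$ given $\underline{s}$ and $\underline{d}$ is deterministically $v(\underline{s},\underline{d})$.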
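As a quick numerical check of the expansion rule (2.1.7), the following Python sketch computes a mean directly and again by conditioning on a second variable. The joint distribution and the names `joint`, `mean_direct` and `mean_by_expansion` are hypothetical and chosen only for illustration.

```python
# Hypothetical joint distribution over a numeric x and a label y (illustrative numbers).
joint = {
    (1.0, "a"): 0.1,
    (2.0, "a"): 0.3,
    (1.0, "b"): 0.4,
    (2.0, "b"): 0.2,
}


def mean_direct(joint):
    """Mean of x computed straight from the joint distribution."""
    return sum(x * p for (x, _), p in joint.items())


def mean_by_expansion(joint):
    """Mean of x as the sum over y of (mean of x given y) times (probability of y)."""
    total = 0.0
    for y in sorted({y for (_, y) in joint}):
        p_y = sum(p for (_, yy), p in joint.items() if yy == y)
        mean_given_y = sum(x * p for (x, yy), p in joint.items() if yy == y) / p_y
        total += mean_given_y * p_y
    return total
```

Both functions return the same number for any valid joint distribution, which is all the expansion rule asserts.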

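We define $\hat{\underline{d}}(\mathcal{E})$ as the decision vector that maximizes the expected
value of $v$:

$$\hat{\underline{d}}(\mathcal{E}) = \max^{-1}_{\underline{d}} \langle v \mid \underline{d},\mathcal{E} \rangle \tag{2.1.9}$$

If $\mathcal{S}$ represents some possible future state of information, we define
$\underline{d}^*(\mathcal{S})$ as the intent to use $\hat{\underline{d}}(\mathcal{S})$ when $\mathcal{S}$ becomes available.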


The Value of Information


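Suppose that an analysis or experiment will provide some data $D$.
Then $(D,\mathcal{E})$ represents an improved state of information. We define
the expected value of the data $\langle v_D \mid \mathcal{E} \rangle$:

$$\langle v_D \mid \mathcal{E} \rangle = \langle v \mid \underline{d}^*(D,\mathcal{E}),\mathcal{E} \rangle - \langle v \mid \underline{d}^*(\mathcal{E}),\mathcal{E} \rangle \tag{2.1.10}$$

Since $\mathcal{E}$ is our prior information, $\hat{\underline{d}}(\mathcal{E})$ is known and thus

$$\langle v \mid \underline{d}^*(\mathcal{E}),\mathcal{E} \rangle = \langle v \mid \hat{\underline{d}}(\mathcal{E}),\mathcal{E} \rangle . \tag{2.1.11}$$

The first term in (2.1.10) is the key to the value of data. Given the
data $D$ we would find

$$\hat{\underline{d}}(D,\mathcal{E}) = \max^{-1}_{\underline{d}} \langle v \mid \underline{d},D,\mathcal{E} \rangle , \tag{2.1.12}$$

which would result in the posterior expected value $\langle v \mid \hat{\underline{d}}(D,\mathcal{E}),D,\mathcal{E} \rangle$.
However, before $D$ is revealed we must compute the prior expectation of
this quantity:

$$\langle v \mid \underline{d}^*(D,\mathcal{E}),\mathcal{E} \rangle = \langle \langle v \mid \hat{\underline{d}}(D,\mathcal{E}),D,\mathcal{E} \rangle \mid \mathcal{E} \rangle \tag{2.1.13}$$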
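The definition (2.1.10) through (2.1.13) can be traced on a small discrete problem. The Python sketch below is only an illustration under assumed numbers: the two states, two decisions, payoff table and likelihoods are all hypothetical, as are the names `value_of_data`, `best_value` and `expected_value`.

```python
# Hypothetical two-state, two-decision problem; every number is illustrative.
STATES = ["low", "high"]
DECISIONS = ["small", "large"]
PRIOR = {"low": 0.5, "high": 0.5}
VALUE = {
    ("low", "small"): 10.0, ("low", "large"): -20.0,
    ("high", "small"): 10.0, ("high", "large"): 50.0,
}


def expected_value(dist, d):
    """Expected value of decision d under a distribution over states."""
    return sum(dist[s] * VALUE[(s, d)] for s in STATES)


def best_value(dist):
    """Expected value of the best decision under the given distribution."""
    return max(expected_value(dist, d) for d in DECISIONS)


def value_of_data(prior, likelihood):
    """Prior expectation of the posterior optimum minus the prior optimum."""
    prior_best = best_value(prior)
    with_data = 0.0
    for outcome in likelihood[STATES[0]]:
        p_outcome = sum(prior[s] * likelihood[s][outcome] for s in STATES)
        posterior = {s: prior[s] * likelihood[s][outcome] / p_outcome for s in STATES}
        with_data += p_outcome * best_value(posterior)
    return with_data - prior_best


# Likelihood of each data outcome given the state, for three experiments.
IMPERFECT = {"low": {"says_low": 0.8, "says_high": 0.2},
             "high": {"says_low": 0.2, "says_high": 0.8}}
PERFECT = {"low": {"says_low": 1.0, "says_high": 0.0},
           "high": {"says_low": 0.0, "says_high": 1.0}}
USELESS = {"low": {"says_low": 0.5, "says_high": 0.5},
           "high": {"says_low": 0.5, "says_high": 0.5}}
```

With these assumed numbers the imperfect experiment is worth 8, the perfect one (clairvoyance) 15, and the useless one 0, so the value of data falls between the two limiting states of information described in Section 1.1.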

2.2  The  Entrepreneur's  Problem,  an  Example 

The expected value of data is a very useful concept in applied decision
analysis. The example of this section demonstrates its importance.
In the remainder of the chapter we examine conditions under
which the example can be generalized to more complex problems.

The example, The Entrepreneur's Problem, was originally formulated
by Howard [3]. Our methodology differs from Howard's, but our numerical
results are the same. The reader need not be familiar with [3] to understand
the example.

Description  of  the  Model 

The Entrepreneur's Problem is illustrated by the schematic tree of
Fig. 2.2a. The entrepreneur must decide at what price $p$ to sell his
new product. His profit $\pi$ is the revenue, price $p$ times quantity
sold $q$, minus the cost $c$. Deterministically, the quantity sold is
related to price through the demand curve $q(p)$. The total cost is
related to the quantity sold and consequently the price through the cost
function $c(q(p))$.

The problem is simplified by assuming that given the prior state of
information $\mathcal{E}$, $c$ and $q$ are probabilistically independent of $p$ and
of each other. The quantity $\Delta q$ is defined as the difference between
the actual demand and the nominal demand $q(p)$. Likewise, $\Delta c$ is the
difference between actual and nominal cost. The independence assumption
implies that

$$\{\Delta q,\Delta c \mid p,\mathcal{E}\} = \{\Delta q \mid \mathcal{E}\}\{\Delta c \mid \mathcal{E}\} . \tag{2.2.1}$$

Both $\Delta c$ and $\Delta q$ are assumed to have zero mean:

$$\langle \Delta q \mid \mathcal{E} \rangle = 0 \tag{2.2.2}$$

$$\langle \Delta c \mid \mathcal{E} \rangle = 0 \tag{2.2.3}$$

Using these simplifications we can modify the deterministic model
as shown in Fig. 2.2b. The demand and cost functions are incorporated
into the model, leaving $\Delta q$, $\Delta c$ and $p$ as the input variables. The
modified model is an example of the basic decision problem from Section
2.1. The presentation can be simplified without loss of generality
by using the reduced random variables, each scaled by its prior
standard deviation:

$$q_r = \Delta q \Big/ \sqrt{\langle \Delta q^2 \mid \mathcal{E} \rangle} \tag{2.2.4}$$

$$c_r = \Delta c \Big/ \sqrt{\langle \Delta c^2 \mid \mathcal{E} \rangle} . \tag{2.2.5}$$

Figure 2.2 The Entrepreneur's Problem: (a) the probabilistic model; (b) the simplified deterministic model

Deterministic  Data 

The first step in analyzing the Entrepreneur's Problem is to perform
deterministic sensitivities. The inputs to the deterministic model
of Fig. 2.2b are varied, and the resulting change in profit is observed.
The sensitivity plots, Figs. 2.3 through 2.6, serve as a numerical
description of the problem.

The first sensitivity is to price. In Fig. 2.3 we see that the
deterministic optimum price $p_o$ is 24.1. As price is raised or lowered
by 10, the profit drops from 198 to 14. The three points are sufficient
to determine a quadratic approximation to the price sensitivity
$\pi(p)$:

$$\pi(p) = \pi(q_r,c_r,p) \tag{2.2.7}$$

where

$$q_r = c_r = 0 . \tag{2.2.8}$$

More compactly we express this sensitivity as

$$\pi(p) = \pi(0,0,p) . \tag{2.2.9}$$
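The three-point construction can be sketched in a few lines of Python. The routine below fits the unique quadratic through three points by divided differences; the function name `quadratic_through` is ours, and the three points are simply the ones quoted above (profit 198 at price 24.1, and 14 at prices 10 lower and 10 higher).

```python
def quadratic_through(points):
    """Return (a, b, c) such that a + b*x + c*x**2 passes through three (x, y) points."""
    (x0, y0), (x1, y1), (x2, y2) = points
    # Newton divided differences.
    f01 = (y1 - y0) / (x1 - x0)
    f12 = (y2 - y1) / (x2 - x1)
    c = (f12 - f01) / (x2 - x0)
    b = f01 - c * (x0 + x1)
    a = y0 - b * x0 - c * x0 ** 2
    return a, b, c


# Price sensitivity points read from the text.
PRICE_POINTS = [(14.1, 14.0), (24.1, 198.0), (34.1, 14.0)]
a, b, c = quadratic_through(PRICE_POINTS)
vertex_price = -b / (2.0 * c)      # price at which the fitted quadratic peaks
second_derivative = 2.0 * c        # curvature of profit with respect to price
```

For these points the fitted quadratic is 198 - 1.84 (p - 24.1)^2, so the second derivative of profit with respect to price is -3.68.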

The sensitivities to quantity are shown in Fig. 2.4. Once again
we use three points to find a quadratic approximation. The open loop
sensitivity is performed by holding $c_r$ and $p$ constant while $q_r$
varies:

$$\pi_o(q_r) = \pi(q_r,0,p_o) \tag{2.2.10}$$
To understand the generation of the closed loop sensitivity we consider
the point where the quantity is 3 standard deviations above the mean.
The open loop profit at this point is 748, an increase of 555 over the
profit at $q_r = 0$. We denote the open loop increase $\Delta\pi_o$.
Turning to Fig. 2.5, we see that $\Delta\pi_o$ corresponds to the increase
in profit with the decision fixed at $p_o$:

$$\Delta\pi_o = \pi(3,0,p_o) - \pi(0,0,p_o) \tag{2.2.11}$$

We see that 748 is just one point on the top curve in Fig. 2.3,
$\pi(3,0,p)$, the price sensitivity for $q_r = 3$. The maximum of this
curve is

$$834 = \max_p \pi(3,0,p) . \tag{2.2.12}$$

Returning to Fig. 2.4, we see that 834 is also the value of the
closed loop sensitivity to quantity evaluated at $q_r = 3$. Therefore,
the closed loop sensitivity to quantity is the change in profit given
the opportunity to reoptimize profit after the quantity is revealed. We
see from either Fig. 2.4 or Fig. 2.5 that the closed loop change $\Delta\pi_c$
can be decomposed into the open loop change $\Delta\pi_o$ plus the compensation
$\Delta\pi_{co}$. The latter term is due to the change in decision $\Delta p_{co}$.
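The open loop, closed loop and compensation curves can be reproduced with a small Python sketch. The profit model below is an assumption, not the entrepreneur's actual model: it is a quadratic whose price curvature comes from the three points quoted for Fig. 2.3, whose linear coefficient `B` is chosen so that the open loop profit at $q_r = 3$ is 748, and whose cross coefficient `G` is chosen so that the closed loop profit at $q_r = 3$ rounds to 834. All names are hypothetical.

```python
P0 = 24.1               # deterministic optimum price quoted in the text
PEAK = 198.0            # profit at the deterministic optimum
CURV = 1.84             # profit falls by 184 when price moves 10 from the optimum
B = 550.0 / 3.0         # assumed: gives open loop profit 748 at q_r = 3
G = (9.5 * 4.0 * CURV) ** 0.5   # assumed: gives closed loop profit near 834 at q_r = 3


def profit(q_r, price):
    """Assumed quadratic profit with cost held at its nominal value."""
    dp = price - P0
    return PEAK + B * q_r + G * q_r * dp - CURV * dp * dp


def open_loop(q_r):
    """Profit with the price held at the deterministic optimum."""
    return profit(q_r, P0)


def closed_loop(q_r, lo=0.0, hi=60.0, steps=6000):
    """Profit after reoptimizing price on a grid once q_r is revealed."""
    return max(profit(q_r, lo + (hi - lo) * i / steps) for i in range(steps + 1))


def compensation(q_r):
    """Gain from being allowed to change the price: closed loop minus open loop."""
    return closed_loop(q_r) - open_loop(q_r)
```

Under these assumptions `open_loop(3.0)` is 748 and `closed_loop(3.0)` is about 833.5, so the compensation at three standard deviations is about 85.5.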

The unique characteristic of the cost sensitivities of Fig. 2.6 is
that the open and closed loop curves coincide. This indicates that the
decision is insensitive to cost. In effect the entrepreneur has written
a blank check to his creditors. He is uncertain about the differential
cost $\Delta c$, but he cannot influence its resolution.
Value of Information for the Entrepreneur's Problem

We now use the problem formulation and the sensitivity data to solve
the Entrepreneur's Problem and to calculate the value of clairvoyance on
the state variables. As we shall see in the following section these
calculations are exact for a quadratic problem. The value of clairvoyance
is interesting because it is prototypal of the approximate
value of data for more complex problems.

Finding the prior optimum decision $\hat{p}(\mathcal{E})$ is straightforward. The
entrepreneur's profit function is quadratic in the state and decision
variables. As we shall see in the next section this implies that the
deterministic and probabilistic optimum decisions coincide:

$$\hat{p}(\mathcal{E}) = p_o \tag{2.2.13}$$

To find the value of clairvoyance on the state variables, we
specialize (2.1.10) to

$$\langle v_C \mid \mathcal{E} \rangle = \langle \pi \mid p^*(C,\mathcal{E}),\mathcal{E} \rangle - \langle \pi \mid \hat{p}(\mathcal{E}),\mathcal{E} \rangle . \tag{2.2.14}$$


Since clairvoyance $(C,\mathcal{E})$ is equivalent to exact knowledge of $q_r$ and
$c_r$ we can expand (2.2.14) to

$$\langle v_C \mid \mathcal{E} \rangle = \langle \langle \pi \mid \hat{p}(q_r,c_r),q_r,c_r,\mathcal{E} \rangle - \langle \pi \mid \hat{p}(\mathcal{E}),q_r,c_r,\mathcal{E} \rangle \mid \mathcal{E} \rangle . \tag{2.2.15}$$

The inner expression of (2.2.15) can be evaluated from sensitivity data.
Since changes in cost do not affect price, the first term reduces to

$$\langle \pi \mid \hat{p}(q_r),q_r,c_r,\mathcal{E} \rangle = \pi(q_r,c_r,\hat{p}(q_r)) = \pi(q_r,0,\hat{p}(q_r)) - \Delta c . \tag{2.2.16}$$

The decomposition of the profit is possible because the cost is additive.
Likewise, the second term is

$$\langle \pi \mid \hat{p}(\mathcal{E}),q_r,c_r,\mathcal{E} \rangle = \pi(q_r,c_r,p_o) = \pi(q_r,0,p_o) - \Delta c . \tag{2.2.17}$$

Subtracting (2.2.17) from (2.2.16) we find that the inner term is the
compensation $\Delta\pi_{co}(q_r)$, the difference between $\pi_c(q_r)$ and $\pi_o(q_r)$
in Fig. 2.4. Therefore, the value of clairvoyance is the expected
value of compensation:

$$\langle v_C \mid \mathcal{E} \rangle = \langle \pi(q_r,0,\hat{p}(q_r)) - \pi(q_r,0,p_o) \mid \mathcal{E} \rangle = \langle \Delta\pi_{co}(q_r) \mid \mathcal{E} \rangle \tag{2.2.18}$$

Figure 2.7 illustrates the expected value of compensation:

$$\langle \Delta\pi_{co}(q_r) \mid \mathcal{E} \rangle = \int_{q_r} \{q_r \mid \mathcal{E}\}\, \Delta\pi_{co}(q_r) \tag{2.2.19}$$

Using the data from Fig. 2.4 the compensation is

$$\Delta\pi_{co}(q_r) = 9.5\, q_r^2 . \tag{2.2.20}$$

Substituting (2.2.20) into (2.2.19) and recalling that the reduced
variable has zero mean and unit variance, the value of clairvoyance is

$$\langle v_C \mid \mathcal{E} \rangle = 9.5 \int_{q_r} \{q_r \mid \mathcal{E}\}\, q_r^2 = 9.5 . \tag{2.2.21}$$

The value of clairvoyance is the curvature of the closed loop sensitivity
to the reduced state variable at $q_r = 0$. In the next section we show
that the curvature of the closed loop sensitivity is important regardless
of the source of the data.
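Because the compensation is a pure quadratic in the reduced variable, its expectation depends only on the variance and not on the shape of the distribution. The Python sketch below checks this for two discrete distributions with zero mean and unit variance; both distributions and the function name `expected_compensation` are hypothetical choices made for illustration.

```python
import math


def expected_compensation(outcomes, coeff=9.5):
    """Expected value of coeff * q_r**2 over a list of (q_r, probability) pairs.

    Raises ValueError unless the distribution has zero mean and unit variance,
    which is the defining property of a reduced variable.
    """
    total_p = sum(p for _, p in outcomes)
    mean = sum(q * p for q, p in outcomes)
    variance = sum((q - mean) ** 2 * p for q, p in outcomes)
    if abs(total_p - 1.0) > 1e-9 or abs(mean) > 1e-9 or abs(variance - 1.0) > 1e-9:
        raise ValueError("outcomes must have zero mean and unit variance")
    return sum(coeff * q * q * p for q, p in outcomes)


# Two different shapes with the same first two moments.
TWO_POINT = [(-1.0, 0.5), (1.0, 0.5)]
THREE_POINT = [(-math.sqrt(3.0), 1.0 / 6.0), (0.0, 2.0 / 3.0), (math.sqrt(3.0), 1.0 / 6.0)]
```

Both distributions give an expected compensation of 9.5, in agreement with (2.2.21).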

2.3 The Value of Data for a Decision Problem with a Quadratic Value
Function

In this section we derive the exact value of data for a quadratic
value function. After we state the result and prove it, we suggest how
to apply it to non-quadratic problems. The theorem of this section is
extended in Chapter 3 and applied in Chapter 5.

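The deterministic model is illustrated in Fig. 2.8. The value
function $v(\underline{s},\underline{d})$ is quadratic in the state vector $\underline{s}$ and the decision
vector $\underline{d}$. We will normalize the state variables to have zero
mean, and the decision variables to be zero at the deterministic maximum:

$$\langle \underline{s} \mid \mathcal{E} \rangle = \underline{0} \tag{2.3.1}$$

$$\hat{\underline{d}}_o = \max^{-1}_{\underline{d}} v(\langle \underline{s} \mid \mathcal{E} \rangle,\underline{d}) = \underline{0} \tag{2.3.2}$$

These assumptions reduce algebraic complexity without sacrificing
generality.

We write the quadratic value function as

$$v(\underline{s},\underline{d}) = a + \underline{b}'\underline{s} + \underline{c}'\underline{d} + \tfrac{1}{2}\,\underline{s}'\underline{E}\,\underline{s} + \underline{s}'\underline{G}\,\underline{d} + \tfrac{1}{2}\,\underline{d}'\underline{H}\,\underline{d} . \tag{2.3.3}$$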


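The second-order necessary and sufficient conditions for $v(\underline{s},\underline{d})$ to
have a maximum at $\langle \underline{s} \mid \mathcal{E} \rangle$ and $\hat{\underline{d}}_o$ are that the gradient of $v$ with
respect to $\underline{d}$, $\nabla v(\langle \underline{s} \mid \mathcal{E} \rangle,\hat{\underline{d}}_o)$, be zero and that the Hessian of $v$ with
respect to $\underline{d}$, $\nabla^2 v(\langle \underline{s} \mid \mathcal{E} \rangle,\hat{\underline{d}}_o)$, be negative definite. Using (2.3.1) and
(2.3.2) the gradient and Hessian at $\langle \underline{s} \mid \mathcal{E} \rangle$ and $\hat{\underline{d}}_o$ are defined as

$$\nabla v(\langle \underline{s} \mid \mathcal{E} \rangle,\hat{\underline{d}}_o) = \left[\frac{\partial v}{\partial d_j}\right] \tag{2.3.4}$$

$$\nabla^2 v(\langle \underline{s} \mid \mathcal{E} \rangle,\hat{\underline{d}}_o) = \left[\frac{\partial^2 v}{\partial d_i\,\partial d_j}\right] . \tag{2.3.5}$$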

Applying  (2.3.4)  and  (2.3.5)  to  the  definition  of  v(s,d)  (2.3.3),  we 
have: 
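[Figure: The quadratic value model,
$v(\underline{s},\underline{d}) = a + \underline{b}'\underline{s} + \underline{c}'\underline{d} + \tfrac{1}{2}\,\underline{s}'\underline{E}\,\underline{s} + \underline{s}'\underline{G}\,\underline{d} + \tfrac{1}{2}\,\underline{d}'\underline{H}\,\underline{d}$,
where the prime denotes the transpose of a matrix, $a$ is a constant,
$\underline{b}$ and $\underline{c}$ are constant vectors, $\underline{s}$ is the state variable vector,
$\underline{d}$ is the decision variable vector, and $\underline{E}$, $\underline{G}$ and $\underline{H}$ are
constant square matrices.]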
$$\nabla v(\underline{0},\underline{0}) = \underline{c}' + \underline{0}'\underline{H} \tag{2.3.6}$$

$$\nabla^2 v(\underline{0},\underline{0}) = \underline{H} \tag{2.3.7}$$

Since the gradient in (2.3.6) must be the zero vector, our assumptions
imply that $\underline{c}$ must also be the zero vector. From (2.3.7) we see that
the Hessian does not vary with $\underline{s}$ and $\underline{d}$ for the quadratic. Therefore
if the deterministic optimum $\hat{\underline{d}}_o$ exists, $\underline{H}$ is negative definite
and the value function has a global maximum with respect to $\underline{d}$ for any
state vector $\underline{s}$.

Chronologically, we receive the data about the state variables.
Then we set the decision vector, and finally nature sets the state variables.
The state variables are independent of the decision variables
but not necessarily independent of each other. We assume that the decision
maker is risk-indifferent so that maximizing the value function is
equivalent to maximizing the decision maker's von Neumann-Morgenstern
utility function.

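With these preliminaries we can state the theorem:

THEOREM: For the quadratic value function

$$v(\underline{s},\underline{d}) = a + \underline{b}'\underline{s} + \tfrac{1}{2}\,\underline{s}'\underline{E}\,\underline{s} + \underline{s}'\underline{G}\,\underline{d} + \tfrac{1}{2}\,\underline{d}'\underline{H}\,\underline{d} , \tag{2.3.8}$$

where the Hessian $\underline{H}$ is negative definite, the value of any data $D$ is

$$\langle v_D \mid \mathcal{E} \rangle = -\tfrac{1}{2} \langle \langle \underline{s} \mid D,\mathcal{E} \rangle' \underline{G}\,\underline{H}^{-1}\underline{G}' \langle \underline{s} \mid D,\mathcal{E} \rangle \mid \mathcal{E} \rangle . \tag{2.3.9}$$

PROOF: From (2.1.10) the value of the data is

$$\langle v_D \mid \mathcal{E} \rangle = \langle v \mid \underline{d}^*(D,\mathcal{E}),\mathcal{E} \rangle - \langle v \mid \underline{d}^*(\mathcal{E}),\mathcal{E} \rangle . \tag{2.3.10}$$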
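The proof is in two parts corresponding to the two terms of (2.3.10).
First we determine the prior maximum expected value $\langle v \mid \underline{d}^*(\mathcal{E}),\mathcal{E} \rangle$;
then we determine the expected value given the opportunity to maximize
after the data is received, $\langle v \mid \underline{d}^*(D,\mathcal{E}),\mathcal{E} \rangle$.

To find $\langle v \mid \underline{d}^*(\mathcal{E}),\mathcal{E} \rangle$ we start with the prior expected value
$\langle v \mid \underline{d},\mathcal{E} \rangle$. Recalling that the expected values of the state variables
are all zero, the prior expectation of (2.3.8) is

$$\langle v \mid \underline{d},\mathcal{E} \rangle = a + \tfrac{1}{2} \langle \underline{s}'\underline{E}\,\underline{s} \mid \mathcal{E} \rangle + \tfrac{1}{2}\,\underline{d}'\underline{H}\,\underline{d} . \tag{2.3.11}$$

The first-order necessary condition for $\langle v \mid \hat{\underline{d}}(\mathcal{E}),\mathcal{E} \rangle$ to be an unconstrained
maximum is that the gradient be zero at $\hat{\underline{d}}(\mathcal{E})$:

$$\nabla \langle v \mid \hat{\underline{d}}(\mathcal{E}),\mathcal{E} \rangle = \underline{0}' \tag{2.3.12}$$

Taking the gradient of (2.3.11) and setting it to zero, we have

$$\hat{\underline{d}}'(\mathcal{E})\,\underline{H} = \underline{0}' . \tag{2.3.13}$$

Since $\underline{H}$ is negative definite, $\hat{\underline{d}}(\mathcal{E})$ must be the zero vector. Therefore
(2.3.11) becomes

$$\langle v \mid \hat{\underline{d}}(\mathcal{E}),\mathcal{E} \rangle = a + \tfrac{1}{2} \langle \underline{s}'\underline{E}\,\underline{s} \mid \mathcal{E} \rangle . \tag{2.3.14}$$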

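Returning to the first term in (2.3.10), the expected value given
data $D$ is

$$\langle v \mid \underline{d},D,\mathcal{E} \rangle = a + \underline{b}' \langle \underline{s} \mid D,\mathcal{E} \rangle + \tfrac{1}{2} \langle \underline{s}'\underline{E}\,\underline{s} \mid D,\mathcal{E} \rangle + \langle \underline{s} \mid D,\mathcal{E} \rangle' \underline{G}\,\underline{d} + \tfrac{1}{2}\,\underline{d}'\underline{H}\,\underline{d} . \tag{2.3.15}$$

Maximizing (2.3.15) with respect to $\underline{d}$ we have

$$\nabla \langle v \mid \hat{\underline{d}}(D,\mathcal{E}),D,\mathcal{E} \rangle = \langle \underline{s} \mid D,\mathcal{E} \rangle' \underline{G} + \hat{\underline{d}}'(D,\mathcal{E})\,\underline{H} = \underline{0}' . \tag{2.3.16}$$

Equation (2.3.16) implies that

$$\hat{\underline{d}}(D,\mathcal{E}) = -\underline{H}^{-1}\underline{G}' \langle \underline{s} \mid D,\mathcal{E} \rangle . \tag{2.3.17}$$

Substituting (2.3.17) into (2.3.15), we have

$$\langle v \mid \hat{\underline{d}}(D,\mathcal{E}),D,\mathcal{E} \rangle = a + \underline{b}' \langle \underline{s} \mid D,\mathcal{E} \rangle + \tfrac{1}{2} \langle \underline{s}'\underline{E}\,\underline{s} \mid D,\mathcal{E} \rangle - \langle \underline{s} \mid D,\mathcal{E} \rangle' \underline{G}\,\underline{H}^{-1}\underline{G}' \langle \underline{s} \mid D,\mathcal{E} \rangle + \tfrac{1}{2} \langle \underline{s} \mid D,\mathcal{E} \rangle' \underline{G}\,\underline{H}^{-1}\underline{G}' \langle \underline{s} \mid D,\mathcal{E} \rangle$$

$$= a + \underline{b}' \langle \underline{s} \mid D,\mathcal{E} \rangle + \tfrac{1}{2} \langle \underline{s}'\underline{E}\,\underline{s} \mid D,\mathcal{E} \rangle - \tfrac{1}{2} \langle \underline{s} \mid D,\mathcal{E} \rangle' \underline{G}\,\underline{H}^{-1}\underline{G}' \langle \underline{s} \mid D,\mathcal{E} \rangle . \tag{2.3.18}$$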

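Recalling (2.1.13), the next step is to take the prior expectation
of (2.3.18). We shall consider each term separately. Of course, expectation
does not affect the value of the constant $a$. The prior expectation
of the posterior mean is the prior mean:

$$\langle \langle \underline{s} \mid D,\mathcal{E} \rangle \mid \mathcal{E} \rangle = \langle \underline{s} \mid \mathcal{E} \rangle \tag{2.3.19}$$

Equation (2.3.19) is a direct application of the definition of conditional
probability. Likewise, the third term becomes

$$\langle \langle \underline{s}'\underline{E}\,\underline{s} \mid D,\mathcal{E} \rangle \mid \mathcal{E} \rangle = \langle \underline{s}'\underline{E}\,\underline{s} \mid \mathcal{E} \rangle . \tag{2.3.20}$$

Applying these results to (2.3.18), and recalling that the prior mean
$\langle \underline{s} \mid \mathcal{E} \rangle$ is zero, we have

$$\langle v \mid \underline{d}^*(D,\mathcal{E}),\mathcal{E} \rangle = a + \tfrac{1}{2} \langle \underline{s}'\underline{E}\,\underline{s} \mid \mathcal{E} \rangle - \tfrac{1}{2} \langle \langle \underline{s} \mid D,\mathcal{E} \rangle' \underline{G}\,\underline{H}^{-1}\underline{G}' \langle \underline{s} \mid D,\mathcal{E} \rangle \mid \mathcal{E} \rangle . \tag{2.3.21}$$

Finally, subtracting (2.3.14) from (2.3.21) the result is

$$\langle v_D \mid \mathcal{E} \rangle = -\tfrac{1}{2} \langle \langle \underline{s} \mid D,\mathcal{E} \rangle' \underline{G}\,\underline{H}^{-1}\underline{G}' \langle \underline{s} \mid D,\mathcal{E} \rangle \mid \mathcal{E} \rangle . \quad \text{Q.E.D.} \tag{2.3.22}$$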
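The theorem can be checked numerically on a small case. The Python sketch below is an illustration under assumed numbers: two state variables with a hypothetical four-point joint distribution, a single decision variable (so $\underline{H}$ is a scalar `h` and $\underline{G}$ a two-element list `g`), and data $D$ that reveals the first state variable exactly. It compares a brute-force evaluation of (2.1.10), using a numerical search over the decision, with the closed form (2.3.9). All names and coefficients are ours.

```python
# Hypothetical zero-mean joint distribution over (s1, s2); D reveals s1 only.
JOINT = [((-1.0, -1.0), 0.3), ((-1.0, 1.0), 0.2), ((1.0, -1.0), 0.2), ((1.0, 1.0), 0.3)]
A = 5.0
B_VEC = [1.0, -2.0]
E_MAT = [[-1.0, 0.5], [0.5, -2.0]]
G_VEC = [2.0, 1.0]      # cross coefficients between each state variable and the decision
H = -4.0                # curvature with respect to the decision (negative)


def value(s, d):
    """Quadratic value function of the form (2.3.8) with one decision variable."""
    linear = sum(B_VEC[i] * s[i] for i in range(2))
    quad_s = sum(s[i] * E_MAT[i][j] * s[j] for i in range(2) for j in range(2))
    cross = sum(G_VEC[i] * s[i] for i in range(2)) * d
    return A + linear + 0.5 * quad_s + cross + 0.5 * H * d * d


def maximize(f, lo=-10.0, hi=10.0, iters=200):
    """Ternary search for the maximum of a concave function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            lo = m1
        else:
            hi = m2
    return f((lo + hi) / 2.0)


def brute_force_value_of_data():
    """Evaluate (2.1.10) directly, reoptimizing the decision for each observed s1."""
    prior_best = maximize(lambda d: sum(p * value(s, d) for s, p in JOINT))
    with_data = 0.0
    for obs in sorted({s[0] for s, _ in JOINT}):
        subset = [(s, p) for s, p in JOINT if s[0] == obs]
        # Maximizing the probability-weighted sum equals P(obs) times the posterior optimum.
        with_data += maximize(lambda d: sum(p * value(s, d) for s, p in subset))
    return with_data - prior_best


def theorem_value_of_data():
    """Evaluate the closed form (2.3.9) from the posterior means."""
    total = 0.0
    for obs in sorted({s[0] for s, _ in JOINT}):
        subset = [(s, p) for s, p in JOINT if s[0] == obs]
        p_obs = sum(p for _, p in subset)
        mean = [sum(p * s[i] for s, p in subset) / p_obs for i in range(2)]
        mg = sum(mean[i] * G_VEC[i] for i in range(2))
        total += p_obs * mg * (1.0 / H) * mg
    return -0.5 * total
```

For these assumed numbers both routes give 0.605. The constant, the linear term and the matrix $\underline{E}$ drop out of the difference, as the proof indicates.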


Special  Cases  of  the  Theorem  That  Appear  in  the  Literature 

Three special cases of the theorem (2.3.9) appear in the literature.
Howard [3, p. 518] treats the case where $\underline{H}$ is diagonal and the
data $D$ is clairvoyance. DeGroot [1, p. 234] solves for $\hat{\underline{d}}$, the estimate
of the random variable $\underline{s}$ which minimizes a quadratic loss function.
In our notation his problem is the case where

$$\underline{E} = -\underline{H} ; \tag{2.3.23}$$

$a$, $\underline{b}$, and $\underline{G}$ are zero; and $D$ is clairvoyance. Raiffa and Schlaifer
[7, p. 188] present the one-dimensional estimation problem without requiring
the data to be clairvoyance.


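2.4 Discussion of the Value of Data for the Quadratic Problem

An alternate expression for the theorem (2.3.9) is:

$$\langle v_D \mid \mathcal{E} \rangle = -\tfrac{1}{2}\,\mathrm{trace}\,\underline{E}_{co}\,\underline{C}_D \tag{2.4.1}$$

where

$$\underline{E}_{co} = \underline{G}\,\underline{H}^{-1}\underline{G}' \tag{2.4.2}$$

$$\underline{C}_D = \left[\langle \langle s_i \mid D,\mathcal{E} \rangle \langle s_j \mid D,\mathcal{E} \rangle \mid \mathcal{E} \rangle\right] . \tag{2.4.3}$$

The trace of a matrix is the sum of its diagonal elements. The value of
data has two major components. The basic decision problem is specified
by $\underline{E}_{co}$, and the experiment is described by $\underline{C}_D$.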
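The equivalence of the trace form (2.4.1) and the quadratic form (2.3.9) can be sketched in Python for a two-by-two case. The matrices and posterior means below are hypothetical, and the helper names (`matmul`, `inverse_2x2`, `value_of_data_trace`, `value_of_data_quadratic_form`) are ours; only the standard library is used.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def inverse_2x2(m):
    (p, q), (r, s) = m
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]


def compensation_matrix(g, h):
    """E_co = G H^-1 G' as in (2.4.2)."""
    return matmul(matmul(g, inverse_2x2(h)), transpose(g))


def value_of_data_trace(g, h, posterior_means):
    """-1/2 trace(E_co C_D); posterior_means is a list of (probability, mean vector)."""
    n = len(g)
    c_d = [[sum(p * m[i] * m[j] for p, m in posterior_means) for j in range(n)]
           for i in range(n)]
    product = matmul(compensation_matrix(g, h), c_d)
    return -0.5 * sum(product[i][i] for i in range(n))


def value_of_data_quadratic_form(g, h, posterior_means):
    """-1/2 times the prior expectation of m' E_co m, as in (2.3.9)."""
    e_co = compensation_matrix(g, h)
    n = len(g)
    total = sum(p * sum(m[i] * e_co[i][j] * m[j] for i in range(n) for j in range(n))
                for p, m in posterior_means)
    return -0.5 * total


# Hypothetical inputs: rows of G index state variables, columns index decisions.
G_EXAMPLE = [[2.0, 0.0], [1.0, 1.0]]
H_EXAMPLE = [[-4.0, 0.0], [0.0, -2.0]]
POSTERIOR_MEANS = [(0.5, [-1.0, -0.2]), (0.5, [1.0, 0.2])]
```

With these inputs both functions return 0.615, which illustrates how the decision problem enters only through $\underline{E}_{co}$ and the experiment only through $\underline{C}_D$.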

We consider $E_{co}$ and $C_D$ briefly for the general case. Then, for
the case that is most common, we discuss how $E_{co}$ and $C_D$ could be
generated. Finally, we return to the Entrepreneur's Problem to illustrate
the encoding of $C_D$.

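$E_{co}$ follows directly from (2.4.2) for a true quadratic value function
since the matrices G and H are specified. For a problem that
is approximately quadratic, G and H can be found by expanding
v(s,d) in a Taylor series about the point $(\langle s \mid \mathcal{E}\rangle, \hat d(\mathcal{E})) = (0,0)$:

$$v(s,d) \approx v(0,0) + \frac{\partial v}{\partial s}\, s + \tfrac{1}{2}\, s' \Big[\frac{\partial^2 v}{\partial s_i\, \partial s_j}\Big] s + s' \Big[\frac{\partial^2 v}{\partial s_i\, \partial d_j}\Big] d + \tfrac{1}{2}\, d' \Big[\frac{\partial^2 v}{\partial d_i\, \partial d_j}\Big] d \tag{2.4.4}$$

The partial derivatives are all evaluated at the point (0,0). Comparing
(2.4.4) with (2.3.8), we see that G and H must be matrices of
partial derivatives:

$$G = \Big[\frac{\partial^2 v}{\partial s_i\, \partial d_j}\Big] \tag{2.4.5}$$

$$H = \Big[\frac{\partial^2 v}{\partial d_i\, \partial d_j}\Big] \tag{2.4.6}$$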

The  partial  derivatives  at  the  operating  point  (0,0)  can  be  approximated 
from  open  loop  sensitivities.  One  joint  sensitivity  is  required  for 
each  possible  pair  of  state  and  decision  variables  and  for  each  possible 
pair  of  decision  variables. 

The elements of the matrix $C_D$ are the expected products of the
posterior means. Since the prior expectation of the posterior mean is
zero, the elements are the covariances of the posterior means.

When the data is clairvoyance on the state variables s, the posterior
means are the state variables themselves, so $C_D$ becomes the prior
covariance matrix $C = \langle s s' \mid \mathcal{E}\rangle$ and (2.4.1) reduces to

$$\langle v_c \mid \mathcal{E}\rangle = -\tfrac{1}{2}\,\operatorname{trace}\, E_{co} C . \tag{2.4.7}$$

If we consider the posterior means $\langle s \mid D,\mathcal{E}\rangle$ as random variables,
comparison of (2.4.7) and (2.4.1) implies that the value of data is the
value of clairvoyance on the posterior means. In most practical problems
the value of clairvoyance on the posterior mean is much easier to
compute than the value of clairvoyance on the data itself.

An  Interesting  Special  Case 

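The most interesting special case occurs when either $E_{co}$ or $C_D$
is a diagonal matrix. Then the value of data becomes

$$\langle v_D \mid \mathcal{E}\rangle = -\tfrac{1}{2} \sum_i g_i H^{-1} g_i'\; {}^{v}\langle \langle s_i \mid D,\mathcal{E}\rangle \mid \mathcal{E}\rangle \tag{2.4.8}$$

where the vector $g_i$ is the $i$th row of G:

$$g_i = [\,G_{i1},\ G_{i2},\ \dots\,] . \tag{2.4.9}$$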
If the state variables are independent, (2.4.8) is exactly equal to
(2.4.1). Sufficient conditions for (2.4.8) to be a good approximation
to (2.4.1) are that the diagonal elements dominate the off-diagonal
elements of $E_{co}$; that is, for each $i$ and $j$:


2  if  H'i.) 

PU  ....  „-l  .2 


(2.4.10) 


Cai  H  £j)‘ 


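where $\rho_{ij}$ is the correlation coefficient

$$\rho_{ij} = \frac{\langle \langle s_i \mid D,\mathcal{E}\rangle \langle s_j \mid D,\mathcal{E}\rangle \mid \mathcal{E}\rangle}{\sqrt{{}^{v}\langle \langle s_i \mid D,\mathcal{E}\rangle \mid \mathcal{E}\rangle\; {}^{v}\langle \langle s_j \mid D,\mathcal{E}\rangle \mid \mathcal{E}\rangle}} . \tag{2.4.11}$$

Given G, H, and $C_D$, these expressions tell us when the diagonal
assumption holds. A more interesting question is whether we can avoid
generating the entire matrices G, H, and $C_D$. The answer is yes,
as shown below.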


Description  of  the  Primary  Problem  Using  Closed  Loop  Sensitivities 

We now show that the term $g_i H^{-1} g_i'$ is, apart from its sign, the
curvature of the compensation of v with respect to the $i$th state variable:

$$g_i H^{-1} g_i' = -\frac{d^2 v_{co}(s_i)}{d s_i^2} \tag{2.4.12}$$

where

$$v_{co}(s_i) = v_c(s_i) - v_o(s_i) . \tag{2.4.13}$$


As we saw in Fig. 2.4, the open loop sensitivity is evaluated by
varying $s_i$ while the other state variables and the decision variables
remain constant. We denote the open loop sensitivity as

$$v_o(s_i) = v(0,0,\dots,s_i,\dots,0,\hat d) . \tag{2.4.14}$$

In closed loop sensitivity the state variables other than $s_i$ remain
fixed, but the decision is reoptimized for each $s_i$:

$$v_c(s_i) = v\big(0,0,\dots,s_i,\dots,0,\ \hat d(0,0,\dots,s_i,\dots,0)\big) . \tag{2.4.15}$$


To show that expression (2.4.12) is valid we evaluate $v_o(s_i)$ and
$v_c(s_i)$ for the quadratic value function (2.3.8):

$$v_o(s_i) = a + b_i s_i + \tfrac{1}{2} e_{ii} s_i^2 \tag{2.4.16}$$

$$v_c(s_i) = \max_d \big(a + b_i s_i + \tfrac{1}{2} e_{ii} s_i^2 + s_i\, g_i d + \tfrac{1}{2} d'Hd\big) = a + b_i s_i + \tfrac{1}{2} e_{ii} s_i^2 - \tfrac{1}{2}\, g_i H^{-1} g_i'\, s_i^2 \tag{2.4.17}$$

Subtracting (2.4.16) from (2.4.17), the compensation is

$$v_{co}(s_i) = -\tfrac{1}{2}\, g_i H^{-1} g_i'\, s_i^2 . \tag{2.4.18}$$

Therefore, by evaluating the curvatures of the compensation curves for
the state variables, the need to find the matrices of partial derivatives
G and H is eliminated.
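The relation between the compensation curve and $g_i H^{-1} g_i'$ can be checked numerically. The sketch below is illustrative only: it assumes one state variable and one decision variable, uses hypothetical coefficients, and reoptimizes the decision by a simple ternary search, which is a stand-in for whatever optimizer an analyst would actually use.

```python
# Sketch: recover g H^{-1} g' from the curvature of a compensation curve.
# Assumed scalar problem v(s, d) = a + b s + e s^2/2 + g s d + h d^2/2, h < 0.

A, B, E, G, H = 10.0, 2.0, -0.3, 1.5, -4.0   # hypothetical coefficients

def v(s, d):
    return A + B * s + 0.5 * E * s * s + G * s * d + 0.5 * H * d * d

def open_loop(s):
    """Open loop sensitivity: the decision stays at the prior optimum d = 0."""
    return v(s, 0.0)

def closed_loop(s, lo=-100.0, hi=100.0, iters=200):
    """Closed loop sensitivity: reoptimize d for each s (v is concave in d)."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if v(s, m1) < v(s, m2):
            lo = m1
        else:
            hi = m2
    return v(s, 0.5 * (lo + hi))

def compensation(s):
    return closed_loop(s) - open_loop(s)

# Central second difference of the compensation curve at s = 0.
step = 1.0
curvature = (compensation(step) - 2.0 * compensation(0.0)
             + compensation(-step)) / step ** 2
print(curvature, -G * G / H)
```

The two printed numbers should agree, which is the scalar version of reading $g_i H^{-1} g_i'$ off the compensation plot instead of assembling G and H.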

The  Description  of  the  Data  Generating  Process  Through  Preposterior 
Moments 

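The second component of (2.4.8) is ${}^{v}\langle \langle s_i \mid D,\mathcal{E}\rangle \mid \mathcal{E}\rangle$, the prior
variance of the posterior mean. To evaluate this term we use the
theorem:

$${}^{v}\langle s \mid \mathcal{E}\rangle = \langle {}^{v}\langle s \mid D,\mathcal{E}\rangle \mid \mathcal{E}\rangle + {}^{v}\langle \langle s \mid D,\mathcal{E}\rangle \mid \mathcal{E}\rangle \tag{2.4.19}$$

A proof of this theorem is given in Raiffa and Schlaifer [7, p. 106].
The theorem states that the prior variance ${}^{v}\langle s \mid \mathcal{E}\rangle$ has two sources. The
expected posterior variance $\langle {}^{v}\langle s \mid D,\mathcal{E}\rangle \mid \mathcal{E}\rangle$ is a residual variance which
will not be resolved by the experiment that generates the data D.
The prior variance of the posterior mean ${}^{v}\langle \langle s \mid D,\mathcal{E}\rangle \mid \mathcal{E}\rangle$ is the portion
of the prior variance that will be resolved by the experiment.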

Sample  Data 

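Expression (2.4.19) is best known for the case where data are N
random samples from $\{s \mid \mathcal{E}\}$. First we consider the limiting cases of no
samples and of infinite samples. Then we consider a finite number of
samples.

When the data is the null experiment, N = 0, the prior and posterior
states of information coincide. Therefore, we have

$$\langle {}^{v}\langle s \mid D,\mathcal{E}\rangle \mid \mathcal{E}\rangle = \langle {}^{v}\langle s \mid \mathcal{E}\rangle \mid \mathcal{E}\rangle = {}^{v}\langle s \mid \mathcal{E}\rangle \tag{2.4.20}$$

$${}^{v}\langle \langle s \mid D,\mathcal{E}\rangle \mid \mathcal{E}\rangle = {}^{v}\langle \langle s \mid \mathcal{E}\rangle \mid \mathcal{E}\rangle = 0 . \tag{2.4.21}$$

When the number of samples approaches infinity, the data is clairvoyance
about s. The posterior probability density function will have
all of its mass at a single point. Consequently, the preposterior moments
are

$$\langle {}^{v}\langle s \mid D,\mathcal{E}\rangle \mid \mathcal{E}\rangle = \langle 0 \mid \mathcal{E}\rangle = 0 \tag{2.4.22}$$

$${}^{v}\langle \langle s \mid D,\mathcal{E}\rangle \mid \mathcal{E}\rangle = {}^{v}\langle s \mid \mathcal{E}\rangle . \tag{2.4.23}$$

To discuss (2.4.19) for finite N it is convenient to define the
ratio r:

$$r = \frac{{}^{v}\langle \langle s \mid D,\mathcal{E}\rangle \mid \mathcal{E}\rangle}{{}^{v}\langle s \mid \mathcal{E}\rangle} . \tag{2.4.24}$$

The limiting cases are r = 0 for the null experiment and r = 1 for
clairvoyance.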

A Bayesian must assign both r and $\{s \mid \mathcal{E}\}$ before he can calculate
the expected value of sample information. For example, Raiffa and
Schlaifer [7, p. 110] suggest assigning an equivalent sample size N'

to  the  term  <-s |D,P>j?>  .  Then  for  certain  conditions  the  parameter 
r is

$$r = \frac{N}{N' + N} . \tag{2.4.25}$$

Assigning either r or N' weights the prior information relative to
the sample information.
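For one concrete sampling model the split in (2.4.19) can be written down directly. The sketch below assumes a normal process with known sampling variance and a normal prior worth N' observations; that conjugate setup is an assumption chosen for illustration, and the function name and numbers are hypothetical. Under it the ratio r of (2.4.24) works out to N / (N' + N).

```python
# Sketch: preposterior variance split for N samples from a normal process.
# Assumptions: known sampling variance, normal prior equivalent to n_prior samples.

def preposterior_split(prior_var, n_prior, n_samples):
    """Return (expected posterior variance, prior variance of posterior mean, r).

    With the prior worth n_prior observations, the sampling variance is
    prior_var * n_prior, and the posterior variance is
    sampling_var / (n_prior + n_samples) whatever the data turn out to be.
    """
    sampling_var = prior_var * n_prior
    expected_posterior_var = sampling_var / (n_prior + n_samples)
    var_of_posterior_mean = prior_var - expected_posterior_var   # from (2.4.19)
    r = var_of_posterior_mean / prior_var                        # from (2.4.24)
    return expected_posterior_var, var_of_posterior_mean, r

for n in (0, 5, 20, 10 ** 6):
    print(n, preposterior_split(prior_var=100.0, n_prior=5, n_samples=n))
```

The loop shows r moving from 0 for the null experiment toward 1 as the sample grows, matching the two limiting cases discussed above.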

Experiments  That  Do  Not  Involve  Sampling 

Encoding and modeling are analogous to sampling because they partially
resolve uncertainty about the state variable s. Encoding the
parameter r, or equivalently ${}^{v}\langle \langle s \mid D,\mathcal{E}\rangle \mid \mathcal{E}\rangle$, should be no more difficult
for these cases than for sampling.

Encoding  for  the  Entrepreneur's  Problem 

Consider the demand in the Entrepreneur's Problem. One possible
experiment to reduce uncertainty is to improve the deterministic model.
A second possibility is to encode $\{q \mid \mathcal{E}\}$ more completely.

In the first case suppose the entrepreneur wants to know whether
it is worthwhile to divide the market into sectors and to study historical
data about consumer response to price changes. To evaluate the model
improvement we encode what the new prediction might be at the price
$p_0 = 24.1$. We ask questions such as: would you rather bet that a fair coin
comes up heads on the next toss, or bet that the new prediction will be
within 10 percent of the original one? A series of such questions reveals
that the entrepreneur's probability density function on the mean shift is
normal with the mean equal to the previous estimate of 58.5 and the variance
equal to 10. Since $\Delta q$ was previously defined as the difference
between the predicted and actual demand we assign

$$\langle {}^{v}\langle q \mid D,\mathcal{E}\rangle \mid \mathcal{E}\rangle = {}^{v}\langle \Delta q \mid \mathcal{E}\rangle = 100 \tag{2.4.26}$$

$${}^{v}\langle \langle q \mid D,\mathcal{E}\rangle \mid \mathcal{E}\rangle = 10 . \tag{2.4.27}$$

Consequently, using (2.4.19), the variance of demand is

$${}^{v}\langle q \mid \mathcal{E}\rangle = 110 . \tag{2.4.28}$$


From  Section  2.2  we  recall  that  the  curvature  of  the  compensation  is 

d2nco(q) 

Eco  ' - 1 -  -  0.095.  (2.4.29) 

dq 

The expected value of the modeling data D is found by substituting
(2.4.27) and (2.4.29) into (2.4.1):

$$\langle v_D \mid \mathcal{E}\rangle = (0.095)(10) = 0.95 . \tag{2.4.30}$$

Now suppose that the entrepreneur has upgraded his model, but he
has left one free parameter, q, the demand at $p_0 = 24.1$. He feels
that if he knew q he would have complete confidence in his model.
His prior on q has moments

$$\langle q \mid \mathcal{E}\rangle = 58.5 \tag{2.4.31}$$

$${}^{v}\langle q \mid \mathcal{E}\rangle = 100 . \tag{2.4.32}$$

Upon questioning, the entrepreneur reveals that a contributing factor
to his uncertainty is personal ignorance. If he had the opportunity
to incorporate his staff's expertise he is confident that $\{q \mid \mathcal{E}\}$
would change. After further questioning he decides that there is
a 50 percent chance that he would change his estimate of q by more than
5 units after learning his staff's opinion. When we point out that this
implies an expected posterior variance of 50, the entrepreneur says, "That
sounds reasonable." Therefore, to compute the value of encoding q we assign

$${}^{v}\langle q \mid \mathcal{E}\rangle = 100 \tag{2.4.33}$$

$$\langle {}^{v}\langle q \mid D,\mathcal{E}\rangle \mid \mathcal{E}\rangle = 50 \tag{2.4.34}$$

$${}^{v}\langle \langle q \mid D,\mathcal{E}\rangle \mid \mathcal{E}\rangle = 50 . \tag{2.4.35}$$

Consequently, using (2.4.29) and (2.4.35) in (2.4.1), the value of encoding
is

$$\langle v_D \mid \mathcal{E}\rangle = (0.095)(50) \approx 4.8 . \tag{2.4.36}$$
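The step from "a 50 percent chance of a shift larger than 5 units" to a variance can be made explicit. The sketch below assumes the shift in the mean is normally distributed about zero, which is an assumption added here for illustration; the function name is hypothetical. Under that assumption the implied variance of the posterior mean is close to the round figure of 50 that the entrepreneur accepts.

```python
from statistics import NormalDist

def variance_from_even_odds_halfwidth(halfwidth):
    """Variance of a zero-mean normal shift x with P(|x| > halfwidth) = 0.5."""
    z = NormalDist().inv_cdf(0.75)     # upper quartile of the unit normal
    sigma = halfwidth / z
    return sigma * sigma

coefficient = 0.095                    # coefficient used in (2.4.30) and (2.4.36)
var_shift = variance_from_even_odds_halfwidth(5.0)

print(var_shift)                       # implied variance of the posterior mean
print(coefficient * var_shift)         # value of encoding with the implied variance
print(coefficient * 50.0)              # value of encoding with the rounded variance
```

The comparison of the last two lines indicates how sensitive the value of encoding is to rounding the assessed variance.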


2.5 The Value of Data for Non-Quadratic Decision Problems

The purpose of this section is to discuss how the theorem of Section
2.3 can be extended to non-quadratic problems. The result is a practical
procedure for ranking the state variables. Given certain non-restrictive
conditions the ranking scheme applies to any single-stage decision problem,
regardless of whether the decision and state variables are continuous or
discrete.

The  Discrete  Decision 

Before we turn to the general evaluation scheme, we consider the
discrete problem. With presubscripts denoting the decision alternative
the value function is

$$v(s,d) = \begin{cases} {}_1a + {}_1b's + \tfrac{1}{2}\, s'\,{}_1E\, s + \dots & d = d_1 \\ {}_2a + {}_2b's + \tfrac{1}{2}\, s'\,{}_2E\, s + \dots & d = d_2 \end{cases} \tag{2.5.1}$$

There are only two possible decisions, $d_1$ and $d_2$.

We  would  like  to  find  an  expression  for  the  value  of  data  similar 
to  the  one  derived  for  the  quadratic  problem  in  Section  2.3.  The  expres¬ 
sion  should  depend  on  deterministic  sensitivity  data  and  the  prior  dis¬ 
tribution  of  the  posterior  mean. 

The key factor in the quadratic problem is that optimizing the value
function evaluated at the mean of the state variables is equivalent to
optimizing the expected value:

$$\max_d{}^{-1}\; v(\langle s \mid D,\mathcal{E}\rangle, d) = \max_d{}^{-1}\; \langle v \mid d,D,\mathcal{E}\rangle \tag{2.5.2}$$

Without this property we cannot have the simplification that the closed
loop stochastic sensitivities can be replaced by deterministic sensitivities:

$$v_{co}(s_i) = \langle v \mid \hat d(s_i,\mathcal{E}), s_i, \mathcal{E}\rangle - \langle v \mid \hat d(\mathcal{E}), \mathcal{E}\rangle \tag{2.5.3}$$

It is straightforward to show that (2.5.2) and consequently (2.5.3)
hold for the discrete problem only if the value function is linear:

$$v(s,d) = \begin{cases} {}_1a + {}_1b's & \text{if } d = d_1 \\ {}_2a + {}_2b's & \text{if } d = d_2 \end{cases} \tag{2.5.4}$$

For this case consider data $D_i$ that impacts only $s_i$:

$$\langle s_j \mid D_i,\mathcal{E}\rangle = \langle s_j \mid \mathcal{E}\rangle \qquad j \ne i \tag{2.5.5}$$

We can show that the expected value of data $D_i$ is the expected
compensation for $s_i$:

$$\langle v_{D_i} \mid \mathcal{E}\rangle = \langle v_{co}(s_i) \mid \mathcal{E}\rangle \tag{2.5.6}$$

Figure 2.9 shows the open and closed loop deterministic sensitivities
for the linear quadratic problem. The terms a and b which completely
specify the closed loop sensitivity for the discrete case did not
even appear in the continuous sensitivities discussed in Sections 2.2
and 2.4.

We notice that the difference between the value of data for the
discrete and continuous cases is contained in the differences between
the sensitivity plots. Expression (2.5.6) holds for both cases. This
similarity suggests the following practical procedure for ranking state
variables in a decision problem:

Step 1: Plot the deterministic open and closed loop sensitivities
for each state variable.

Step 2: Calculate the expected value of compensation for each
variable.

Discussion  of  the  Ranking  Scheme 

Certain  conditions  must  hold  for  the  ranking  to  be  accurate.  First, 
the  condition  (2.5.3)  that  the  stochastic  compensation  is  approximately 
the  deterministic  compensation  must  hold.  Second,  the  state  variables 
must  be  effectively  independent  in  the  sense  of  (2.4.10)  and  (2.5.5). 
Third,  if  the  data  is  not  clairvoyance,  the  r  coefficient  defined  in 
(2.4.24)  must  be  the  same  for  each  variable,  making  the  value  of  data 
proportional  to  the  value  of  clairvoyance.  For  most  practical  problems 
this  is  not  a  restrictive  set  of  assumptions. 

An approximate method for performing Step 1 for the quadratic
problem is suggested in the Entrepreneur's example of Section 2.2. For
the discrete problem with many possible decisions the closed loop sensitivity
can be found by plotting all of the open loop sensitivities and
taking the maximum as a function of $s_i$. This technique is illustrated
for the two option case in Fig. 2.9. Problems with both discrete and
continuous decision variables will have more complex compensation plots
than either the continuous or discrete cases.
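Taking the maximum over the open loop sensitivities is straightforward to mechanize. The sketch below is an illustration under stated assumptions: each alternative is taken to be linear in a single state variable, and the three alternatives and their coefficients are hypothetical.

```python
# Sketch: closed loop sensitivity for a discrete decision as the upper envelope
# of the open loop sensitivities. Alternatives and coefficients are hypothetical.

alternatives = {              # name -> (intercept a_k, slope b_k) of v(s_i, d_k)
    "d1": (10.0, -1.0),
    "d2": (8.0, 1.0),
    "d3": (9.5, 0.0),
}

def open_loop(name, s):
    """Value of one fixed alternative as the state variable s varies."""
    a, b = alternatives[name]
    return a + b * s

def closed_loop(s):
    """Best achievable value once s is known: the maximum over alternatives."""
    return max(open_loop(name, s) for name in alternatives)

def compensation(s, prior_mean=0.0):
    """Closed loop value minus the value of the alternative chosen at the prior mean."""
    prior_choice = max(alternatives, key=lambda name: open_loop(name, prior_mean))
    return closed_loop(s) - open_loop(prior_choice, s)

for s in (-2.0, 0.0, 1.0, 3.0):
    print(s, closed_loop(s), compensation(s))
```

The compensation is zero wherever the prior choice remains best and grows linearly once another alternative overtakes it, which is the piecewise linear shape that Step 2 integrates against the distribution of the state variable.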

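If the compensation plot is approximately quadratic, Step 2 is performed
by finding the curvature of the compensation plot and multiplying
by the variance of $s_i$. If the compensation plot has the form of
Fig. 2.9, the expected compensation is:

$$\langle v_{D_i} \mid \mathcal{E}\rangle = \begin{cases} |{}_2b_i - {}_1b_i|\; L^{(r)}(s_b) & \langle s_i \mid \mathcal{E}\rangle < s_b \\ |{}_2b_i - {}_1b_i|\; L^{(l)}(s_b) & \langle s_i \mid \mathcal{E}\rangle > s_b \end{cases} \tag{2.5.7}$$

where

$$s_b = \frac{{}_1a - {}_2a}{{}_2b_i - {}_1b_i} \tag{2.5.8}$$

and

$$L^{(r)}(s_b) = \int_{s_b}^{\infty} d\langle s_i \mid D_i,\mathcal{E}\rangle\; \big(\langle s_i \mid D_i,\mathcal{E}\rangle - s_b\big)\; \{\langle s_i \mid D_i,\mathcal{E}\rangle \mid \mathcal{E}\} \tag{2.5.9}$$

$$L^{(l)}(s_b) = \int_{-\infty}^{s_b} d\langle s_i \mid D_i,\mathcal{E}\rangle\; \big(s_b - \langle s_i \mid D_i,\mathcal{E}\rangle\big)\; \{\langle s_i \mid D_i,\mathcal{E}\rangle \mid \mathcal{E}\} \tag{2.5.10}$$

The linear loss integrals (2.5.9) and (2.5.10) are tabulated functions
for the normal distribution. It is straightforward to compute the linear
loss integrals for the uniform and the triangular distributions.
When either the sensitivity plot or the probability density function
has a complex functional form, numerical integration is required to perform
Step 2.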
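For a normally distributed posterior mean the linear loss integrals have a closed form in terms of the unit normal density and distribution function, so no table is needed. The sketch below uses that standard closed form; the function names and the two-alternative example are hypothetical, and the rule for choosing between the right-hand and left-hand integral is the usual one (integrate on the side of the breakeven point opposite the prior mean).

```python
from statistics import NormalDist

def right_linear_loss(breakeven, mean, std):
    """Integral of (m - breakeven) over m > breakeven, m ~ Normal(mean, std)."""
    unit = NormalDist()
    u = (breakeven - mean) / std
    return std * (unit.pdf(u) - u * (1.0 - unit.cdf(u)))

def left_linear_loss(breakeven, mean, std):
    """Integral of (breakeven - m) over m < breakeven, m ~ Normal(mean, std)."""
    unit = NormalDist()
    u = (breakeven - mean) / std
    return std * (unit.pdf(u) + u * unit.cdf(u))

def expected_compensation(a1, b1, a2, b2, mean, std):
    """Expected compensation for two linear alternatives a_k + b_k * s.

    mean and std describe the prior distribution of the posterior mean of s.
    """
    breakeven = (a1 - a2) / (b2 - b1)
    if mean < breakeven:
        loss = right_linear_loss(breakeven, mean, std)
    else:
        loss = left_linear_loss(breakeven, mean, std)
    return abs(b2 - b1) * loss

# Hypothetical example: alternatives 10 - s and 8 + s, posterior mean ~ N(0, 2).
print(expected_compensation(10.0, -1.0, 8.0, 1.0, mean=0.0, std=2.0))
```

For the uniform or triangular cases mentioned above the same structure applies, with the two loss functions replaced by the corresponding integrals.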


2.6  Deliberate  Introduction  of  Error 

We deliberately introduce errors into a decision analysis if the
resulting computational savings exceed the expected loss. In this section
we show that the expression for the loss $\ell$ from using the approximate
probability density function $\{s \mid \mathcal{E}\}^a$ instead of the accurate one $\{s \mid \mathcal{E}\}$
for the quadratic problem is:

$$\ell = -\tfrac{1}{2}\,\operatorname{trace}\, E_{co} S \tag{2.6.1}$$

where

$$S = \langle s \mid \mathcal{E}\rangle^a\, \langle s' \mid \mathcal{E}\rangle^a . \tag{2.6.2}$$

This is the same as the value of data (2.4.1) with the covariance matrix
$C_D$ replaced by S, the matrix of products of the approximate means.
S is the zero matrix for the accurate probability density function
$\{s \mid \mathcal{E}\}$ because the state variables are normalized to have zero mean:

$$\langle s \mid \mathcal{E}\rangle = 0 \tag{2.6.3}$$

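To derive (2.6.1) we must distinguish between two types of error.
The total change in expected value $\Delta v^a$ is the difference between the
maximum expected values based on the correct and approximate probability
density functions:

$$\Delta v^a = \langle v \mid \hat d(\mathcal{E}),\mathcal{E}\rangle - \langle v \mid \hat d(\mathcal{E})^a,\mathcal{E}\rangle^a \tag{2.6.4}$$

By adding and subtracting $\langle v \mid \hat d(\mathcal{E})^a,\mathcal{E}\rangle$ we can divide the total change
into two parts:

$$\Delta v^a = \underbrace{\langle v \mid \hat d(\mathcal{E}),\mathcal{E}\rangle - \langle v \mid \hat d(\mathcal{E})^a,\mathcal{E}\rangle}_{\Delta v^a_{co}} + \underbrace{\langle v \mid \hat d(\mathcal{E})^a,\mathcal{E}\rangle - \langle v \mid \hat d(\mathcal{E})^a,\mathcal{E}\rangle^a}_{\Delta v^a_{o}} \tag{2.6.5}$$

The decomposition is illustrated in Fig. 2.10 for a single decision variable.

The loss (2.6.1) is defined as the compensation:

$$\ell = \Delta v^a_{co} \tag{2.6.6}$$

The loss is the difference between the expected value given the best
decision $\hat d(\mathcal{E})$ and the expected value given the inferior decision
$\hat d(\mathcal{E})^a$.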

To understand why $\Delta v^a_o$ should not be included in the loss, suppose
that the only effect of the approximation is to add the fixed amount
$a_0$ to every outcome:

$$\langle v \mid d,\mathcal{E}\rangle^a = \langle v \mid d,\mathcal{E}\rangle + a_0 \tag{2.6.7}$$

Taking the gradient of both sides of (2.6.7), we find

$$\hat d(\mathcal{E})^a = \hat d(\mathcal{E}) . \tag{2.6.8}$$

Adding $a_0$ to every outcome changes the calculated expected value without
changing the decision. Using (2.6.7) and (2.6.8) in (2.6.5), the
two terms are

$$\Delta v^a_{co} = 0 \tag{2.6.9}$$

$$\Delta v^a_{o} = -a_0 \tag{2.6.10}$$

Just as information only has value if it can change the decision, errors
only cause losses if they affect the decision. Consequently, the
expected economic loss is $\Delta v^a_{co}$, not $\Delta v^a$.

Quadratic  Terms 

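We now derive expressions for $\Delta v^a_{co}$ and $\Delta v^a_{o}$ for the quadratic
case. The expected value of v given d and $\{s \mid \mathcal{E}\}^a$ is

$$\langle v \mid d,\mathcal{E}\rangle^a = a + b'\langle s \mid \mathcal{E}\rangle^a + \tfrac{1}{2}\langle s'Es \mid \mathcal{E}\rangle^a + \langle s' \mid \mathcal{E}\rangle^a G d + \tfrac{1}{2} d'Hd . \tag{2.6.11}$$

Maximizing (2.6.11), we have

$$\hat d(\mathcal{E})^a = -H^{-1}G'\langle s \mid \mathcal{E}\rangle^a . \tag{2.6.12}$$

Substituting (2.6.12) into (2.6.11) yields

$$\langle v \mid \hat d(\mathcal{E})^a,\mathcal{E}\rangle^a = a + b'\langle s \mid \mathcal{E}\rangle^a + \tfrac{1}{2}\langle s'Es \mid \mathcal{E}\rangle^a - \tfrac{1}{2}\langle s' \mid \mathcal{E}\rangle^a G H^{-1} G' \langle s \mid \mathcal{E}\rangle^a . \tag{2.6.13}$$

Substituting (2.6.12) into $\langle v \mid d,\mathcal{E}\rangle$ from (2.3.11), where the true mean
$\langle s \mid \mathcal{E}\rangle$ is zero, we have

$$\langle v \mid \hat d(\mathcal{E})^a,\mathcal{E}\rangle = a + \tfrac{1}{2}\langle s'Es \mid \mathcal{E}\rangle + \tfrac{1}{2}\langle s' \mid \mathcal{E}\rangle^a G H^{-1} G' \langle s \mid \mathcal{E}\rangle^a . \tag{2.6.14}$$

From (2.3.14) the prior solution is

$$\langle v \mid \hat d(\mathcal{E}),\mathcal{E}\rangle = a + \tfrac{1}{2}\langle s'Es \mid \mathcal{E}\rangle . \tag{2.6.15}$$

Subtracting (2.6.14) from (2.6.15) and (2.6.13) from (2.6.14) yields
the desired terms:

$$\Delta v^a_{co} = -\tfrac{1}{2}\langle s' \mid \mathcal{E}\rangle^a G H^{-1} G' \langle s \mid \mathcal{E}\rangle^a \tag{2.6.16}$$

$$\Delta v^a_{o} = -b'\langle s \mid \mathcal{E}\rangle^a + \tfrac{1}{2}\langle s'Es \mid \mathcal{E}\rangle - \tfrac{1}{2}\langle s'Es \mid \mathcal{E}\rangle^a + \langle s' \mid \mathcal{E}\rangle^a G H^{-1} G' \langle s \mid \mathcal{E}\rangle^a \tag{2.6.17}$$

Because H is negative definite at a maximum, (2.6.16) is nonnegative, and
it equals $-\tfrac{1}{2}\operatorname{trace}\, G H^{-1} G' S$ with S the matrix of products of the
approximate means, in agreement with (2.6.1).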

To evaluate a computational procedure we must be careful to distinguish
between open loop changes and compensatory changes. Since the
open loop changes do not affect our decision, eliminating them only
satisfies curiosity. It is the compensatory changes that we are willing
to pay to eliminate.
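The distinction between compensatory and open loop changes can be seen in a small numerical case. The sketch below assumes one state variable and one decision variable with a quadratic value function; the coefficients and the "approximate" moments are hypothetical and chosen only to make the two terms visibly different.

```python
# Sketch: split the total change in expected value into a compensatory part
# (the loss) and an open loop part, for an assumed scalar quadratic problem
# v(s, d) = a + b s + e s^2/2 + g s d + h d^2/2 with h < 0.

a, b, e, g, h = 0.0, 1.0, -0.2, 2.0, -4.0     # hypothetical coefficients

def expected_value(d, mean_s, second_moment_s):
    """<v|d> for given first and second moments of s."""
    return (a + b * mean_s + 0.5 * e * second_moment_s
            + g * mean_s * d + 0.5 * h * d * d)

def best_decision(mean_s):
    """Maximizer of the expected value: d = -g <s> / h."""
    return -g * mean_s / h

true_mean, true_second = 0.0, 1.0             # accurate moments (mean normalized to 0)
approx_mean, approx_second = 0.5, 1.3         # hypothetical approximate moments

d_true = best_decision(true_mean)
d_approx = best_decision(approx_mean)

# Compensatory change: both decisions scored with the accurate moments.
dv_co = (expected_value(d_true, true_mean, true_second)
         - expected_value(d_approx, true_mean, true_second))
# Open loop change: the approximate decision scored both ways.
dv_o = (expected_value(d_approx, true_mean, true_second)
        - expected_value(d_approx, approx_mean, approx_second))

loss_formula = -0.5 * (g * g / h) * approx_mean ** 2
print(dv_co, loss_formula, dv_o)
```

Only the first quantity depends on the decision actually being changed by the approximation; the open loop term would be present even if the decision were unaffected.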
CHAPTER 3


VALUE  OF  ANALYSIS  FOR  A  RISK  SENSITIVE  DECISION  MAKER 

3.0  Introduction 

The object of this chapter is to extend the results of Chapter 2
to a risk sensitive decision maker. We calculate the value of clairvoyance
for an exponential utility function and a quadratic value function.
The result is not a practical one because it involves third and
fourth covariances which are difficult to encode. However, the result
provides a basis for determining conditions under which the expressions
from Chapter 2 are valid. In the final section of the chapter we compute
the loss if risk preference is omitted from a decision analysis.

3.1 Preliminaries

The phenomenon of risk preference is well known. Pratt [5] and
Howard [2] give excellent treatments of the subject. Basically, the
decision maker assigns a utility function which gives a number u to
every possible value v. Decision alternatives are described by the
probability distribution on v, or lottery, that they represent. The
fundamental theorem of decision theory is that one lottery is preferred
to another if and only if the expected utility of the first is greater
than the second. Therefore, $d_1$ is preferred to $d_2$ if and only if

$$\langle u(v) \mid d_1,\mathcal{E}\rangle > \langle u(v) \mid d_2,\mathcal{E}\rangle . \tag{3.1.1}$$

The certain equivalent of a lottery is defined as the value ${}^{\sim}\langle v \mid \mathcal{E}\rangle$
such that the utility of ${}^{\sim}\langle v \mid \mathcal{E}\rangle$ is equal to the expected utility of
the lottery:

$$u\big({}^{\sim}\langle v \mid \mathcal{E}\rangle\big) = \langle u(v) \mid \mathcal{E}\rangle \tag{3.1.2}$$


Risk  Aversion  and  Exponential  Utility 

Pratt [5] has shown that all of the essential information about a
utility function is contained in the local risk aversion coefficient
$\gamma(v)$:

$$\gamma(v) = -\frac{u''(v)}{u'(v)} \tag{3.1.3}$$

A constant risk aversion coefficient $\gamma$ implies the exponential utility
curve

$$u(v) = -e^{-\gamma v} \tag{3.1.4}$$

up to a positive linear transformation. The local risk aversion for other
utility functions can be conveniently represented by a power series about
the mean of the lottery $\langle v \mid \mathcal{E}\rangle$:

$$\gamma(v) = \gamma(\langle v \mid \mathcal{E}\rangle) + \gamma'(\langle v \mid \mathcal{E}\rangle)\,(v - \langle v \mid \mathcal{E}\rangle) + \dots \tag{3.1.5}$$

From (3.1.5) we see that any utility curve can be approximated by the
exponential utility curve with

$$\gamma = \gamma(\langle v \mid \mathcal{E}\rangle) \tag{3.1.6}$$

as long as the variance of v is not too large.

The exponential utility function is convenient analytically because
it has the delta property: if a fixed amount $\delta$ is added to each of the
prizes in a lottery, then the certain equivalent is also increased by $\delta$:

$${}^{\sim}\langle v + \delta \mid \mathcal{E}\rangle = {}^{\sim}\langle v \mid \mathcal{E}\rangle + \delta \tag{3.1.7}$$

The  Approximate  Certain  Equivalent 

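An approximation for the certain equivalent is

$${}^{\sim}\langle v \mid \mathcal{E}\rangle \approx \langle v \mid \mathcal{E}\rangle - \tfrac{1}{2}\,\gamma(\langle v \mid \mathcal{E}\rangle)\; {}^{v}\langle v \mid \mathcal{E}\rangle . \tag{3.1.8}$$

Expression (3.1.8) is exact for exponential utility and a normal lottery.
Therefore, for lotteries which are approximately symmetric and not too
diffuse, (3.1.8) should be an excellent approximation.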
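The quality of the mean-minus-half-gamma-times-variance approximation, and the delta property, can both be checked on a small discrete lottery. The sketch below assumes the utility function u(v) = -exp(-gamma v), which is one common normalization of exponential utility and is an assumption here; the lottery and the value of gamma are hypothetical.

```python
import math

def certain_equivalent(outcomes, probs, gamma):
    """Exact certain equivalent under the assumed utility u(v) = -exp(-gamma v)."""
    expected_utility = sum(p * -math.exp(-gamma * v)
                           for v, p in zip(outcomes, probs))
    return -math.log(-expected_utility) / gamma

def approximate_certain_equivalent(outcomes, probs, gamma):
    """Mean minus one half gamma times variance."""
    mean = sum(p * v for v, p in zip(outcomes, probs))
    var = sum(p * (v - mean) ** 2 for v, p in zip(outcomes, probs))
    return mean - 0.5 * gamma * var

outcomes = [0.0, 10.0, 20.0]       # hypothetical symmetric lottery
probs = [0.25, 0.5, 0.25]
gamma = 0.02                       # hypothetical risk aversion coefficient

exact = certain_equivalent(outcomes, probs, gamma)
approx = approximate_certain_equivalent(outcomes, probs, gamma)
shifted = certain_equivalent([v + 5.0 for v in outcomes], probs, gamma)

print(exact, approx)               # close for a symmetric, not too diffuse lottery
print(shifted - exact)             # delta property: the shift passes straight through
```

Raising gamma or spreading the outcomes further apart widens the gap between the exact and approximate values, which is the sense in which the lottery must be "not too diffuse".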

The  Risk-Sensitive  Value  of  Clairvoyance 

The risk-sensitive value of clairvoyance is defined as the cost
k such that the expected utilities with and without clairvoyance are
equal:

$$\langle u \mid k,C,\mathcal{E}\rangle = \langle u \mid \mathcal{E}\rangle \tag{3.1.9}$$

Using the delta property and the definition of clairvoyance, it is
straightforward to show that for exponential utility (3.1.9) reduces to

$$k = {}^{\sim}\langle v_c \mid \mathcal{E}\rangle = {}^{\sim}\langle v \mid d^*(C,\mathcal{E}),\mathcal{E}\rangle - {}^{\sim}\langle v \mid \hat d(\mathcal{E}),\mathcal{E}\rangle . \tag{3.1.10}$$

The second term in (3.1.10) is the risk-sensitive solution to the
basic problem. In this chapter $\hat d(\mathcal{E})$ represents

$$\hat d(\mathcal{E}) = \max_d{}^{-1}\; {}^{\sim}\langle v \mid d,\mathcal{E}\rangle . \tag{3.1.11}$$

When we wish to refer to the $\hat d(\mathcal{E})$ of Chapter 2 that maximizes the
expected value of v we will use $d_0$, since we showed in Chapter 2 that
$d_0$ maximizes $\langle v \mid d,\mathcal{E}\rangle$ as well as $v(\langle s \mid \mathcal{E}\rangle, d)$.

The first term in (3.1.10) may be simplified. Clairvoyance allows
us to optimize d after s is revealed:

$$d^*(C,\mathcal{E}) = d(s) \tag{3.1.12}$$

Given both s and d, v is known deterministically; therefore, we
have

$$d(s) = \max_d{}^{-1}\; {}^{\sim}\langle v \mid s,d,\mathcal{E}\rangle = \max_d{}^{-1}\; v(s,d) . \tag{3.1.14}$$

Since the resulting certain equivalent is a function only of s, we
may write

$${}^{\sim}\langle v \mid d^*(C,\mathcal{E}),\mathcal{E}\rangle = {}^{\sim}\langle v(s,d(s)) \mid \mathcal{E}\rangle . \tag{3.1.15}$$

To find the certain equivalent given clairvoyance, we maximize the
deterministic value function, perform a change of variables from s to
v, and find the certain equivalent of the resulting lottery.

3.2 The Value of Clairvoyance for a Risk-Sensitive Quadratic Problem

Although it is possible to define the value of data for a risk-sensitive
decision maker, the resulting expressions are so complex
that we gain little insight. Instead, we examine the special case
where the data is clairvoyance on the vector of state variables. By
examining the conditions under which the risk-sensitive value of clairvoyance
reduces to the risk-indifferent value of clairvoyance, we
learn when we can apply the results from Chapter 2.

The Derivation of the Approximate Value of Clairvoyance

To find the value of clairvoyance (3.2.1), we know from (3.1.10)
that we find the difference between the certain equivalent with clairvoyance
and the certain equivalent without.


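Starting with the primary problem, we want to find

$${}^{\sim}\langle v \mid \hat d(\mathcal{E}),\mathcal{E}\rangle^a = \langle v \mid \hat d(\mathcal{E}),\mathcal{E}\rangle - \tfrac{1}{2}\gamma\; {}^{v}\langle v \mid \hat d(\mathcal{E}),\mathcal{E}\rangle . \tag{3.2.2}$$

We have used the small symbol a to denote an approximation. We define
$\hat d(\mathcal{E})^a$ as

$$\hat d(\mathcal{E})^a = \max_d{}^{-1}\; {}^{\sim}\langle v \mid d,\mathcal{E}\rangle^a . \tag{3.2.3}$$

Then our approximation is

$${}^{\sim}\langle v \mid \hat d(\mathcal{E}),\mathcal{E}\rangle \approx {}^{\sim}\langle v \mid \hat d(\mathcal{E})^a,\mathcal{E}\rangle^a . \tag{3.2.4}$$

To find ${}^{\sim}\langle v \mid d,\mathcal{E}\rangle^a$ we need the mean and variance. From (2.3.11)
the mean is

$$\langle v \mid d,\mathcal{E}\rangle = a + \tfrac{1}{2}\langle s'Es \mid \mathcal{E}\rangle + \tfrac{1}{2}\, d'Hd . \tag{3.2.5}$$

To find the variance we square expression (2.3.8) for v(s,d):

$$\begin{aligned} v^2(s,d) ={}& a^2 + b's\,s'b + \tfrac{1}{4}(s'Es)^2 + d'G's\,s'Gd + \tfrac{1}{4}(d'Hd)^2 \\ &+ 2a\,b's + a\,s'Es + 2a\,s'Gd + a\,d'Hd + b's\,s'Es \\ &+ 2b's\,s'Gd + b's\,d'Hd + s'Es\,s'Gd + \tfrac{1}{2}\,s'Es\,d'Hd \\ &+ s'Gd\,d'Hd \end{aligned} \tag{3.2.6}$$

Abbreviating the mean vector and the covariance matrix,

$$\bar s = \langle s \mid \mathcal{E}\rangle \tag{3.2.7}$$

$$C = \langle s s' \mid \mathcal{E}\rangle , \tag{3.2.8}$$

and recalling that the state variables are normalized so that $\bar s = 0$,
the expectation of (3.2.6) is

$$\begin{aligned} \langle v^2 \mid d,\mathcal{E}\rangle ={}& a^2 + b'Cb + \tfrac{1}{4}\langle (s'Es)^2 \mid \mathcal{E}\rangle + d'G'CGd + \tfrac{1}{4}(d'Hd)^2 \\ &+ a\langle s'Es \mid \mathcal{E}\rangle + a\,d'Hd + b'\langle s\,s'Es \mid \mathcal{E}\rangle + 2b'CGd \\ &+ \langle s'Es\,s' \mid \mathcal{E}\rangle Gd + \tfrac{1}{2}\langle s'Es \mid \mathcal{E}\rangle\, d'Hd . \end{aligned} \tag{3.2.9}$$

Squaring $\langle v \mid d,\mathcal{E}\rangle$ from (3.2.5) we have

$$\langle v \mid d,\mathcal{E}\rangle^2 = a^2 + \tfrac{1}{4}\langle s'Es \mid \mathcal{E}\rangle^2 + \tfrac{1}{4}(d'Hd)^2 + a\langle s'Es \mid \mathcal{E}\rangle + a\,d'Hd + \tfrac{1}{2}\langle s'Es \mid \mathcal{E}\rangle\, d'Hd . \tag{3.2.10}$$

Subtracting (3.2.10) from (3.2.9) we calculate the variance:

$$\begin{aligned} {}^{v}\langle v \mid d,\mathcal{E}\rangle ={}& b'Cb + \tfrac{1}{4}\big(\langle (s'Es)^2 \mid \mathcal{E}\rangle - \langle s'Es \mid \mathcal{E}\rangle^2\big) + d'G'CGd \\ &+ b'\langle s\,s'Es \mid \mathcal{E}\rangle + 2b'CGd + \langle s'Es\,s' \mid \mathcal{E}\rangle Gd \end{aligned} \tag{3.2.11}$$

Combining (3.2.5) and (3.2.11) using (3.2.2), we have

$$\begin{aligned} {}^{\sim}\langle v \mid d,\mathcal{E}\rangle^a ={}& a + \tfrac{1}{2}\langle s'Es \mid \mathcal{E}\rangle - \tfrac{1}{2}\gamma\Big(b'Cb + \tfrac{1}{4}\big(\langle (s'Es)^2 \mid \mathcal{E}\rangle - \langle s'Es \mid \mathcal{E}\rangle^2\big) + b'\langle s\,s'Es \mid \mathcal{E}\rangle\Big) \\ &- \tfrac{1}{2}\gamma\big(2b'C + \langle s'Es\,s' \mid \mathcal{E}\rangle\big) Gd + \tfrac{1}{2}\, d'(H - \gamma G'CG)\, d . \end{aligned} \tag{3.2.12}$$

The gradient is

$$\nabla\; {}^{\sim}\langle v \mid d,\mathcal{E}\rangle^a = -\gamma\big(b'C + \tfrac{1}{2}\langle s'Es\,s' \mid \mathcal{E}\rangle\big) G + d'(H - \gamma G'CG) . \tag{3.2.13}$$

Setting the gradient to zero, we find

$$\hat d(\mathcal{E})^a = \gamma\,(H - \gamma G'CG)^{-1} G'\big(Cb + \tfrac{1}{2}\langle s\,s'Es \mid \mathcal{E}\rangle\big) . \tag{3.2.14}$$

Finally, substituting (3.2.14) into (3.2.12), the solution is

$$\begin{aligned} {}^{\sim}\langle v \mid \hat d(\mathcal{E})^a,\mathcal{E}\rangle^a ={}& a + \tfrac{1}{2}\langle s'Es \mid \mathcal{E}\rangle - \tfrac{1}{2}\gamma\Big(b'Cb + \tfrac{1}{4}\big(\langle (s'Es)^2 \mid \mathcal{E}\rangle - \langle s'Es \mid \mathcal{E}\rangle^2\big) + b'\langle s\,s'Es \mid \mathcal{E}\rangle\Big) \\ &- \tfrac{1}{2}\gamma^2\big(b'C + \tfrac{1}{2}\langle s'Es\,s' \mid \mathcal{E}\rangle\big) G (H - \gamma G'CG)^{-1} G'\big(Cb + \tfrac{1}{2}\langle s\,s'Es \mid \mathcal{E}\rangle\big) . \end{aligned} \tag{3.2.15}$$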

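We now consider the other term in (3.1.10), the certain equivalent
with clairvoyance. Expression (3.1.14) indicates that we should maximize
the value function given s. The gradient of v(s,d) is

$$\nabla v(s,d) = s'G + d'H . \tag{3.2.16}$$

Setting the gradient to zero yields

$$d(s) = -H^{-1}G's . \tag{3.2.17}$$

Notice that this is the exact maximum since we have not yet introduced
the certain equivalent approximation for this case. Substituting into
(2.3.8), we have

$$v(s,d(s)) = a + b's + \tfrac{1}{2}\, s'(E - G H^{-1} G')\, s . \tag{3.2.18}$$

Proceeding in analogy to (3.2.5) through (3.2.11), the mean and
variance are

$$\langle v(s,d(s)) \mid \mathcal{E}\rangle = a + \tfrac{1}{2}\langle s'(E - G H^{-1} G')s \mid \mathcal{E}\rangle , \tag{3.2.19}$$

$$\begin{aligned} {}^{v}\langle v(s,d(s)) \mid \mathcal{E}\rangle ={}& b'Cb + \tfrac{1}{4}\big(\langle (s'(E - G H^{-1} G')s)^2 \mid \mathcal{E}\rangle - \langle s'(E - G H^{-1} G')s \mid \mathcal{E}\rangle^2\big) \\ &+ b'\langle s\,s'(E - G H^{-1} G')s \mid \mathcal{E}\rangle . \end{aligned} \tag{3.2.20}$$

Combining (3.2.19) and (3.2.20), using (3.2.2), we have

$$\begin{aligned} {}^{\sim}\langle v(s,d(s)) \mid \mathcal{E}\rangle^a ={}& a + \tfrac{1}{2}\langle s'(E - G H^{-1} G')s \mid \mathcal{E}\rangle \\ &- \tfrac{1}{2}\gamma\Big(b'Cb + \tfrac{1}{4}\big(\langle (s'(E - G H^{-1} G')s)^2 \mid \mathcal{E}\rangle - \langle s'(E - G H^{-1} G')s \mid \mathcal{E}\rangle^2\big) \\ &\qquad + b'\langle s\,s'(E - G H^{-1} G')s \mid \mathcal{E}\rangle\Big) . \end{aligned} \tag{3.2.21}$$

Finally, subtracting (3.2.15) from (3.2.21) we have the result:

$$\begin{aligned} {}^{\sim}\langle v_c \mid \mathcal{E}\rangle^a ={}& -\tfrac{1}{2}\langle s'G H^{-1} G's \mid \mathcal{E}\rangle \\ &- \tfrac{1}{8}\gamma\Big(\langle (s'G H^{-1} G's)^2 \mid \mathcal{E}\rangle - \langle s'G H^{-1} G's \mid \mathcal{E}\rangle^2 - 2\langle s'G H^{-1} G's\; s'Es \mid \mathcal{E}\rangle \\ &\qquad + 2\langle s'G H^{-1} G's \mid \mathcal{E}\rangle \langle s'Es \mid \mathcal{E}\rangle - 4\,b'\langle s\,s'G H^{-1} G's \mid \mathcal{E}\rangle\Big) \\ &+ \tfrac{1}{2}\gamma^2\big(b'C + \tfrac{1}{2}\langle s'Es\,s' \mid \mathcal{E}\rangle\big) G (H - \gamma G'CG)^{-1} G'\big(Cb + \tfrac{1}{2}\langle s\,s'Es \mid \mathcal{E}\rangle\big) \end{aligned} \tag{3.2.22}$$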

Discussion of the Risk-Sensitive Value of Clairvoyance

We now examine conditions under which the risk-indifferent value
of clairvoyance (2.4.7) is a good approximation to the risk-sensitive
value of clairvoyance (3.2.22). Since for $\gamma$ equal to zero the two
expressions must be equal, the first term of (3.2.22) is the
risk-indifferent value of clairvoyance:

$$\langle v_c \mid \mathcal{E}\rangle = -\tfrac{1}{2}\langle s'G H^{-1} G's \mid \mathcal{E}\rangle \tag{3.2.23}$$

Therefore, we want conditions such that the first term of (3.2.22)
dominates the others.

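To eliminate third and fourth covariances, we assume that the
state variables are normal and independent. Under these assumptions
the following properties are true:

$$\operatorname{cov}\langle s_i, s_j \mid \mathcal{E}\rangle = \begin{cases} s_i^v & i = j \\ 0 & i \ne j \end{cases} \tag{3.2.24}$$

$$\operatorname{cov}\langle s_i, s_j, s_k \mid \mathcal{E}\rangle = 0 \qquad \text{all } i,j,k \tag{3.2.25}$$

$$\operatorname{cov}\langle s_i, s_j, s_k, s_l \mid \mathcal{E}\rangle = \begin{cases} 3\,(s_i^v)^2 & i = j = k = l \\ s_i^v\, s_k^v & i = j,\ k = l,\ i \ne k \\ s_i^v\, s_j^v & i = k,\ j = l \ \text{ or } \ i = l,\ j = k,\ \ i \ne j \\ 0 & \text{otherwise} \end{cases} \tag{3.2.26}$$

We use the abbreviated symbol for variance, $s_i^v$, when the state of
information is understood to be $\mathcal{E}$. Using (3.2.25) to eliminate the
third covariances, sufficient conditions for the first term of (3.2.22)
to dominate the others are:

$$\langle v_c \mid \mathcal{E}\rangle \gg \tfrac{1}{8}\gamma\,\big(\langle (s'E_{co}s)^2 \mid \mathcal{E}\rangle - \langle s'E_{co}s \mid \mathcal{E}\rangle^2\big) \tag{3.2.27}$$

$$\langle v_c \mid \mathcal{E}\rangle \gg \tfrac{1}{4}\gamma\,\big|\langle s'E_{co}s\; s'Es \mid \mathcal{E}\rangle - \langle s'E_{co}s \mid \mathcal{E}\rangle \langle s'Es \mid \mathcal{E}\rangle\big| \tag{3.2.28}$$

$$\langle v_c \mid \mathcal{E}\rangle \gg \tfrac{1}{2}\gamma^2\,\big|b'C G (H - \gamma G'CG)^{-1} G'C b\big| \tag{3.2.29}$$

where we have used the definition of $E_{co}$ from Chapter 2:

$$E_{co} = G H^{-1} G' \tag{3.2.30}$$

Applying (3.2.24) and (3.2.26), the conditions are simplified:

$$\langle v_c \mid \mathcal{E}\rangle \gg \tfrac{1}{4}\gamma \sum_i \sum_j e_{co\,ij}^2\; s_i^v s_j^v \tag{3.2.30}$$

$$\langle v_c \mid \mathcal{E}\rangle \gg \tfrac{1}{2}\gamma\,\Big|\sum_i \sum_j e_{co\,ij}\, e_{ij}\; s_i^v s_j^v\Big| \tag{3.2.31}$$

$$\langle v_c \mid \mathcal{E}\rangle \gg \tfrac{1}{2}\gamma^2\,\big|b'C G (H - \gamma G'CG)^{-1} G'C b\big| \tag{3.2.32}$$

Expressions (3.2.30) through (3.2.32) depend only on the first and second
order partial derivatives of v and on the covariances of the
state variables.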


One  State  Variable  and  One  Decision  Variable 


To better understand the three conditions (3.2.30), (3.2.31), and
(3.2.32) we specialize them to the case of one state variable and one
decision variable. Recognizing that the value of clairvoyance may be
written

$\langle v_c | \mathcal{E} \rangle = \tfrac{1}{2}\, e_{co}\, {}^{v}s$    (3.2.33)

the conditions reduce to

$\dfrac{1}{\gamma} \gg \langle v_c | \mathcal{E} \rangle$    (3.2.34)

$\dfrac{1}{\gamma} \gg |e\, {}^{v}s|$    (3.2.35)

$\dfrac{1}{\gamma^2} \gg b^2\, {}^{v}s$    (3.2.36)

The second two conditions, (3.2.35) and (3.2.36), are no surprise.
Assuming (3.2.34) holds, (3.2.35) and (3.2.36) together imply

$\dfrac{1}{\gamma^2} \gg b^2\, {}^{v}s + \tfrac{1}{2}\, e^2\, ({}^{v}s)^2 = {}^{v}\langle v | d = 0, \mathcal{E} \rangle .$    (3.2.37)


The criterion that the variance be much less than the reciprocal of
the risk aversion coefficient is the one usually invoked to justify
the certain equivalent approximation (3.1.8) (see Howard [2, p. 513]).

The new insight is that the value of clairvoyance on the state
variable must be small relative to the reciprocal of the risk aversion
coefficient. Even if the variance criterion (3.2.37) is satisfied, a
problem is unsuitable for approximate analysis if strong coupling
between the state and decision variables causes a violation of
(3.2.34).
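
As a minimal sketch of how the scalar conditions (3.2.34) through (3.2.36)
might be checked numerically, the following Python function evaluates each
one. The function name, the factor of 10 used to interpret "much greater
than", and the scalar form e_co = -g^2/h (a one-dimensional analogue of the
Chapter 2 definition, with the sign assumed here) are assumptions made for
illustration and are not part of the original development.

```python
def check_scalar_conditions(gamma, b, e, g, h, var_s, ratio=10.0):
    """Evaluate the one-state, one-decision validity conditions.

    gamma : risk aversion coefficient (> 0)
    b, e  : first and second partial derivatives of v with respect to s
    g     : mixed partial derivative of v with respect to s and d
    h     : second partial derivative of v with respect to d (< 0 at a maximum)
    var_s : prior variance of the state variable
    ratio : hypothetical threshold standing in for ">>"
    """
    e_co = -g * g / h                 # assumed scalar analogue of E_co
    v_clair = 0.5 * e_co * var_s      # value of clairvoyance, as in (3.2.33)
    checks = {
        "3.2.34": 1.0 / gamma >= ratio * v_clair,
        "3.2.35": 1.0 / gamma >= ratio * abs(e * var_s),
        "3.2.36": 1.0 / gamma**2 >= ratio * b * b * var_s,
    }
    return v_clair, checks
```

A problem that fails the first check is strongly coupled in the sense of
the preceding paragraph, even when the two variance checks pass.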

3.3  The Approximate Value of Risk Preference

In this section we consider both the loss from deliberate suppression
of risk preference and the gain from additional assessment. These
are important quantities for the applications of Chapter 5.

The most interesting result of this section is that the risk aversion
coefficient $\gamma$ can be treated as a random variable. The approximate
value of risk preference encoding is the approximate value of
information on $\gamma$, treating $\gamma$ as if it were a state variable.

Deliberate Suppression of Risk Preference

Deliberate suppression of risk preference results in an inaccurate
decision. The loss from using the risk-indifferent optimum $d_0$ instead
of the risk-sensitive optimum $d(\mathcal{E})$ is

$\ell = {}^{\sim}\langle v | d(\mathcal{E}), \mathcal{E} \rangle - {}^{\sim}\langle v | d_0, \mathcal{E} \rangle .$    (3.3.1)

For a single decision variable, (3.3.1) is illustrated by the lower
curve in Fig. 3.1.

As discussed in the previous section the exact certain equivalent
is difficult to compute. Therefore, we approximate (3.3.1) by

$\ell^{a} = {}^{\sim}\langle v | d(\mathcal{E})^{a}, \mathcal{E} \rangle^{a} - {}^{\sim}\langle v | d_0, \mathcal{E} \rangle^{a} ,$    (3.3.2)

where the superscript a denotes the certain equivalent approximation
(3.1.8). The first term in (3.3.2) is (3.2.15) and the second term is
(3.2.12) evaluated at $d_0 = 0$. Consequently, we have

$\ell^{a} = -\tfrac{1}{2}\gamma^2 \left(b'C + \tfrac{1}{2}\langle s'E s\, s' | \mathcal{E} \rangle\right) G (H - \gamma G'CG)^{-1} G' \left(C b + \tfrac{1}{2}\langle s\, s'E s | \mathcal{E} \rangle\right) .$    (3.3.3)

If we were only interested in the suppression of risk aversion,
(3.3.3) would suffice. However, for additional risk preference
assessment, the expression analogous to (3.3.3) is difficult to derive.
Therefore, we now show that a second approximation of $\ell$ is

$\ell^{am} = \langle v | d_0, \mathcal{E} \rangle - \langle v | d(\mathcal{E})^{a}, \mathcal{E} \rangle ,$    (3.3.4)

where the superscript am denotes the approximation based on the mean.
The approximation is illustrated in the top curve of Fig. 3.1.

To motivate the approximation we return to the case of one state
variable and one decision variable. We assume that condition (3.2.34)
holds; the reciprocal of the risk aversion coefficient is large
compared to the value of clairvoyance on the state variable. Using the
definition of clairvoyance, this condition implies that

$|h| \gg \gamma\, g^2\, {}^{v}\langle s | \mathcal{E} \rangle .$    (3.3.5)

Differentiating the expressions for the mean and variance, (3.2.5) and
(3.2.11), we see that (3.3.5) implies

$\left| \dfrac{\partial^2 \langle v | d, \mathcal{E} \rangle}{\partial d^2} \right| \gg \dfrac{1}{2}\gamma \left| \dfrac{\partial^2\, {}^{v}\langle v | d, \mathcal{E} \rangle}{\partial d^2} \right| .$    (3.3.6)

Therefore, when condition (3.2.34) holds the variance is approximately
linear in d.

In Fig. 3.2 we show the relationship between the mean and certain
equivalent when the variance is linear in d. Using (3.1.8), the certain
equivalent is the mean minus the risk premium, one half $\gamma$ times
the variance. At $d_0$ the slope of the expected value is zero; therefore,
the slope of the certain equivalent at $d_0$ is minus the slope of
the risk premium. Likewise, at $d(\mathcal{E})^{a}$ the slope of the certain
equivalent is zero, making the slope of the mean equal to the slope of the
risk premium. By (3.2.5) and (3.2.11) both the mean and variance are
quadratic in d. Since the risk premium is linear in d, both the
mean and certain equivalent must have the same constant second partial
derivative with respect to d.

The conclusion is that the curves for the mean and certain equivalent
in Fig. 3.2 are identical parabolas; if we translated the curve
for the certain equivalent to the right by

$\Delta d = d_0 - d(\mathcal{E})^{a}$    (3.3.7)

and upward by

$\Delta v = \langle v | d_0, \mathcal{E} \rangle - {}^{\sim}\langle v | d(\mathcal{E})^{a}, \mathcal{E} \rangle^{a} ,$    (3.3.8)

the two curves would coincide. It follows that the two approximations
(3.3.2) and (3.3.4) are the same for this case:

$\ell^{a} = \ell^{am}$    (3.3.9)
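
The Value of Additional Assessment

To evaluate the value of additional risk assessment we consider $\gamma$
as a random variable. Assuming that $\gamma$ is the only useful data from
risk preference encoding, we would like to compute

$\langle v_\gamma | \mathcal{E} \rangle = {}^{\sim}\langle v | d^{*}(\gamma, \mathcal{E}), \mathcal{E} \rangle - {}^{\sim}\langle v | d(\mathcal{E}), \mathcal{E} \rangle .$    (3.3.10)

Unfortunately, the first term in (3.3.10) is difficult to calculate
since the expansion rule does not hold for certain equivalents:

${}^{\sim}\langle v | d^{*}(\gamma, \mathcal{E}), \mathcal{E} \rangle \neq \langle\, {}^{\sim}\langle v | d(\gamma, \mathcal{E}), \gamma, \mathcal{E} \rangle \,|\, \mathcal{E} \rangle$    (3.3.11)

Consequently, we define the approximate expected gain:

$\langle v_\gamma | \mathcal{E} \rangle^{am} = \langle v | d(\mathcal{E})^{a}, \mathcal{E} \rangle - \langle v | d^{*}(\gamma, \mathcal{E})^{a}, \mathcal{E} \rangle = \langle v | d(\mathcal{E})^{a}, \mathcal{E} \rangle - \langle \langle v | d(\gamma, \mathcal{E})^{a}, \gamma, \mathcal{E} \rangle | \mathcal{E} \rangle$    (3.3.12)

The approximation is good when the analog to (3.3.5) holds; the Hessian
of the mean H should dominate the Hessian of the risk premium
$-\gamma G'CG$. The optimal decisions in (3.3.12) are approximate because
they are based on the certain equivalent approximation (3.1.8) and
because we assume that the solution prior to encoding can be found by
fixing $\gamma$ at its mean:

$d(\mathcal{E})^{a} = \max_{d}^{-1} \left( \langle v | d, \mathcal{E} \rangle - \tfrac{1}{2} \langle \gamma | \mathcal{E} \rangle\, {}^{v}\langle v | d, \mathcal{E} \rangle \right)$    (3.3.13)

$d(\gamma, \mathcal{E})^{a} = \max_{d}^{-1} \left( \langle v | d, \mathcal{E} \rangle - \tfrac{1}{2} \gamma\, {}^{v}\langle v | d, \mathcal{E} \rangle \right)$    (3.3.14)

The prior solution (3.3.13) follows from Section 3.2 by substituting
$\langle \gamma | \mathcal{E} \rangle$ for $\gamma$: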
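
$d(\mathcal{E})^{a} = z\, \langle \gamma | \mathcal{E} \rangle$    (3.3.15)

$\langle v | d(\mathcal{E})^{a}, \mathcal{E} \rangle = a + \tfrac{1}{2} \langle s'E s | \mathcal{E} \rangle + \tfrac{1}{2}\, z'H z\, \langle \gamma | \mathcal{E} \rangle^2$    (3.3.16)

where for notational convenience we define

$z = (H - \gamma G'CG)^{-1} G' \left( C b + \tfrac{1}{2} \langle s\, s'E s | \mathcal{E} \rangle \right)$    (3.3.17)

The solution when $\gamma$ is known is

$d(\gamma, \mathcal{E})^{a} = z\, \gamma$    (3.3.18)

$\langle v | d(\gamma, \mathcal{E})^{a}, \gamma, \mathcal{E} \rangle = a + \tfrac{1}{2} \langle s'E s | \mathcal{E} \rangle + \tfrac{1}{2}\, z'H z\, \gamma^2 .$    (3.3.19)

Taking the expectation of (3.3.19) we have

$\langle \langle v | d(\gamma, \mathcal{E})^{a}, \gamma, \mathcal{E} \rangle | \mathcal{E} \rangle = a + \tfrac{1}{2} \langle s'E s | \mathcal{E} \rangle + \tfrac{1}{2}\, z'H z\, \langle \gamma^2 | \mathcal{E} \rangle .$    (3.3.20)

Substituting (3.3.20) and (3.3.16) into (3.3.12) we have the
result:

$\langle v_\gamma | \mathcal{E} \rangle^{am} = -\tfrac{1}{2}\, z'H z\, {}^{v}\langle \gamma | \mathcal{E} \rangle \approx -\tfrac{1}{2}\, b'CG (H - \gamma G'CG)^{-1} G'Cb\; {}^{v}\langle \gamma | \mathcal{E} \rangle$    (3.3.21)

where we have dropped the terms involving third covariances from (3.3.21).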
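
The final form of (3.3.21) can be evaluated directly from the preliminary
data. The following is a minimal NumPy sketch under stated assumptions:
the function name and argument layout are hypothetical, third covariance
terms are dropped as in the text, and the mean of the risk aversion
coefficient is used inside the inverse.

```python
import numpy as np


def value_of_risk_assessment(b, C, G, H, gamma_mean, gamma_var):
    """Approximate value of further risk preference encoding.

    Computes -1/2 * b'CG (H - gamma G'CG)^{-1} G'Cb * var(gamma),
    with third covariance terms omitted.
    """
    b = np.asarray(b, dtype=float).reshape(-1)          # n_s vector
    C = np.atleast_2d(np.asarray(C, dtype=float))       # n_s x n_s covariance
    G = np.asarray(G, dtype=float).reshape(len(b), -1)  # n_s x n_d coupling
    H = np.atleast_2d(np.asarray(H, dtype=float))       # n_d x n_d Hessian

    w = G.T @ C @ b                       # G'Cb
    M = H - gamma_mean * (G.T @ C @ G)    # H - gamma G'CG (negative definite)
    return float(-0.5 * (w @ np.linalg.solve(M, w)) * gamma_var)
```

Because the matrix in the inverse is negative definite at a maximum, the
returned value is non-negative and grows in proportion to the prior
variance of the risk aversion coefficient.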
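
Discussion of the Value of Additional Risk Preference

We can rederive (3.3.21) using the theorem of Chapter 2 and the
following analogy:

$v^{an} = {}^{\sim}\langle v | d, \mathcal{E} \rangle$    (3.3.22)

$s^{an} = \gamma - \langle \gamma | \mathcal{E} \rangle$    (3.3.23)

$d^{an} = d$    (3.3.24)

where the superscript an denotes analogy. We treat the certain
equivalent as the mean or objective function and $\gamma$ as the state
variable, leaving d as the decision vector. Differentiating the
objective function and evaluating at $d = 0$ and $\gamma = \langle \gamma | \mathcal{E} \rangle$, we have

$H^{an} = \nabla_d^2\, {}^{\sim}\langle v | d, \mathcal{E} \rangle = H - \langle \gamma | \mathcal{E} \rangle\, G'CG$    (3.3.25)

$G^{an} = \nabla_\gamma \nabla_d\, {}^{\sim}\langle v | d, \mathcal{E} \rangle = -\left( b'C + \tfrac{1}{2} \langle s'E s\, s' | \mathcal{E} \rangle \right) G .$    (3.3.26)

Notice that the covariance matrix C has been absorbed into the
coefficients. Finally the prior variance of the posterior mean of the state
variable is

${}^{v}\langle \langle s | D, \mathcal{E} \rangle^{an} | \mathcal{E} \rangle = {}^{v}\langle \langle \gamma | \gamma, \mathcal{E} \rangle | \mathcal{E} \rangle = {}^{v}\langle \gamma | \mathcal{E} \rangle .$    (3.3.27)

Substituting (3.3.25), (3.3.26) and (3.3.27) into (2.3.9) and dropping
the terms involving third covariances, we have

$\langle v_\gamma | \mathcal{E} \rangle^{an} \approx -\tfrac{1}{2}\, G^{an} (H^{an})^{-1} G^{an\prime}\; {}^{v}\langle \gamma | \mathcal{E} \rangle = -\tfrac{1}{2}\, b'CG (H - \langle \gamma | \mathcal{E} \rangle G'CG)^{-1} G'Cb\; {}^{v}\langle \gamma | \mathcal{E} \rangle ,$    (3.3.28)

which is the same as (3.3.21).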

The conclusion is that by treating $\gamma$ as if it were a state
variable we can find the value of additional encoding. The parameters
required other than $\langle \gamma | \mathcal{E} \rangle$ and ${}^{v}\langle \gamma | \mathcal{E} \rangle$ are the same ones needed to compute
the value of clairvoyance on the state variables.

CHAPTER 4


THE  DESIGN  OF  MONTE  CARLO  SAMPLING 


4.0  Introduction 

This  chapter  is  logically  separate  from  the  others.  To  under¬ 
stand  the  applications  in  Chapter  5,  the  reader  must  comprehend  this 
development  as  well  as  the  results  in  Chapters  2  and  3.  However,  in 
this  chapter  only  the  results  in  Section  4.1  are  important  to  the 
reader's  understanding  of  Chapter  5.  The  derivation  of  Section  4.2 
and  the  discussion  of  Section  4.3  involve  new  concepts  and  notation 
which  might  be  a  burden  to  the  casual  reader. 


4.1  The Expected Loss from Incomplete Sampling

In  this  section  we  discuss  how  sampling  can  be  used  to  approxi¬ 
mate  the  optimum  decision  for  a  problem  with  a  single  decision  vari¬ 
able.  We  define  the  resulting  loss  as  the  difference  between  the  exact 
expected  value  and  the  approximate  one.  We  state  and  discuss  the  re¬ 
sult  for  a  quadratic  value  function  in  this  section,  leaving  the  deriva¬ 
tion  until  Section  4.2. 

The deterministic model for this section has one decision variable
and many state variables:

$v(s,d) = a + b's + \tfrac{1}{2}\, s'E s + s'g\, d + \tfrac{1}{2}\, h\, d^2$    (4.1.1)

where we have modified the G and H of previous chapters to g and
h respectively. This reflects the change in dimensionality. The
object of the sampling program is to maximize the expected value of v:

$\max_{d}\ \langle v | d, \mathcal{E} \rangle$    (4.1.2)

For the quadratic problem we can solve (4.1.2) exactly using the
results of Chapter 2. However, for more complex problems the exact
solution may be impossible to find, and we might use an approximate
solution based on sampling. The general approach is to discretize the
decision variable and then to find the approximate expected value
$\langle v | d_j, \mathcal{E} \rangle$ at each discrete decision setting $d_j$. We can generate the
$i$th sample at the $j$th decision setting by choosing a random sample ${}_{i}s$
from the probability density function of the state vector $\{s | \mathcal{E}\}$ and
calculating the associated value of v:

${}_{i}v_j = v({}_{i}s, d_j)$    (4.1.3)

Sampling in this manner, we can solve for $d(\mathcal{E})$ without ever
calculating the probability density function $\{v | d, \mathcal{E}\}$.

Figure 4.1 illustrates the terminology we need to state the sampling
problem more precisely. The $n_d$ decision settings are equally
spaced over a predetermined range $2\Delta$. The range is centered at the
deterministic optimum, which is zero in accord with previous chapters:

$d_0 = 0$    (4.1.4)

To simplify the notation and derivation of Section 4.2, we have assumed
that $n_d$ is odd and defined the new parameter L, where

$n_d = 2L + 1 .$    (4.1.5)

At each decision setting $d_j$, $n_s$ random samples are taken from
the probability density function $\{v | d_j, \mathcal{E}\}$. Then a quadratic least
squares curve is fit through all of the data points. The maximum of
this curve $d(\mathcal{E})^{a}$ approximates the optimum decision $d(\mathcal{E})$.

Figure 4.1  Terminology for the sampling problem ($2L + 1$ discrete settings of d)
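
The sampling program just described can be sketched in a few lines of
Python. This is an illustrative sketch only: the function name and the
callables `value_fn` and `draw_state` are hypothetical stand-ins for the
deterministic model v(s,d) and for a sampler of the state vector, and
`numpy.polyfit` is used here for the quadratic least squares fit.

```python
import numpy as np


def sample_optimum(value_fn, draw_state, delta, L, n_s, rng):
    """Approximate the optimum decision by sampling and a quadratic fit.

    value_fn(s, d) : deterministic model v(s, d)            (hypothetical)
    draw_state(rng): one random draw of the state vector s   (hypothetical)
    delta, L       : half-range and grid parameter, n_d = 2L + 1
    n_s            : number of samples per decision setting
    """
    d_grid = np.linspace(-delta, delta, 2 * L + 1)  # equally spaced, centered at 0
    d_pts, v_pts = [], []
    for d in d_grid:
        for _ in range(n_s):
            s = draw_state(rng)            # fresh state sample for every point
            d_pts.append(d)
            v_pts.append(value_fn(s, d))   # sampled value at this setting
    c2, c1, _ = np.polyfit(d_pts, v_pts, 2)  # fit c2*d^2 + c1*d + c0
    if c2 >= 0.0:
        raise ValueError("fitted curve has no interior maximum")
    return -c1 / (2.0 * c2)                # maximum of the fitted parabola
```

Note that the probability density function of v is never formed; only
sampled values of v enter the fit.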

The main result of this section is that for large $n_d$ the expected
loss $\langle \ell | N, \mathcal{E} \rangle$ from using $d(\mathcal{E})^{a}$ rather than $d(\mathcal{E})$ is


<i|N,e> 


(4.1.6) 


where

$N = n_s n_d .$    (4.1.7)

The loss is proportional to the variance at $d(\mathcal{E})$ divided by the
total number of samples. This is the result that we need for Chapter 5.


Since the loss in (4.1.6) depends on the product of $n_s$ and $n_d$,
we would not expect it to matter whether we discretize d finely,
taking only a few samples per setting, or whether we discretize d
coarsely, taking many samples per setting. However, this conclusion
is only valid for large $n_d$. In other words, very fine discretization
is approximately equivalent to fine discretization. In Section 4.3
we show that fine discretization always results in a smaller expected
loss than coarse discretization.
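
The insensitivity to the split between $n_s$ and $n_d$ can be explored
with a small simulation. The sketch below is an illustration under
assumptions that are not part of the derivation: the test problem has mean
value 0.5*h*d^2 with the exact optimum at zero, the sampling noise is
normal with a standard deviation that does not depend on d, and the
function name and arguments are hypothetical.

```python
import numpy as np


def simulated_loss(h, sigma, delta, L, n_s, reps, seed=0):
    """Average loss in expected value from using the fitted optimum.

    Assumed test problem: <v|d> = 0.5*h*d**2 with h < 0 (exact optimum d = 0),
    plus independent normal noise of standard deviation sigma on every sample.
    """
    rng = np.random.default_rng(seed)
    d_grid = np.linspace(-delta, delta, 2 * L + 1)  # 2L+1 settings over 2*delta
    d_all = np.repeat(d_grid, n_s)                  # n_s samples per setting
    losses = np.empty(reps)
    for r in range(reps):
        v = 0.5 * h * d_all**2 + sigma * rng.standard_normal(d_all.size)
        c2, c1, _ = np.polyfit(d_all, v, 2)         # quadratic least squares fit
        d_hat = -c1 / (2.0 * c2)                    # maximum of the fitted curve
        losses[r] = -0.5 * h * d_hat**2             # exact mean at 0 minus at d_hat
    return float(losses.mean())
```

For example, comparing `simulated_loss(-4.0, 1.0, 1.0, L=10, n_s=10, reps=2000)`
with `simulated_loss(-4.0, 1.0, 1.0, L=50, n_s=2, reps=2000)` holds the total
number of samples nearly fixed while changing the discretization, and
increasing `n_s` fourfold at fixed `L` shows how the loss scales with N.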

Finally, let us recognize that (4.1.6) is based on a simple sampling
program. More complicated procedures would use higher order curve
fits and would combine prior data with sample data. As long as the
range $\Delta$ is not too large and the sample size N is not too small,
those sophisticated techniques should still exhibit the insensitivity
of our simple approach.

4.2  Proof that the Expected Loss Is Inversely Proportional to the
Total Number of Sample Points

To derive expression (4.1.6), we need to introduce some additional
concepts. Least squares curve fitting is most conveniently analyzed
in terms of orthogonal polynomials. Any function f(d) that can be
expanded in a Taylor series can also be expanded in terms of orthogonal
polynomials:

$f(d) = \sum_{k=0}^{m} b_k\, \varphi_k(d)$    (4.2.1)

The $b_k$'s are the coefficients, and the $\varphi_k$'s are the polynomials. The
number of terms m is the order of the fit.

Ralston [8] shows that the first three orthogonal polynomials are:

$\varphi_0(d) = 1$    (4.2.2)

$\varphi_1(d) = \dfrac{d}{\Delta}$    (4.2.3)

$\varphi_2(d) = -\dfrac{L + 1}{2L - 1} + \dfrac{3 L\, d^2}{(2L - 1)\, \Delta^2}$    (4.2.4)

where $\Delta$ and L are defined in Section 4.1. If we sample f(d) at
the points $d_j$, where j varies over the integers from $-L$ to $+L$,
the sample coefficients are defined as

$b_k^{a} = \dfrac{\sum_{j=-L}^{+L} f(d_j)\, \varphi_k(d_j)}{\sum_{j=-L}^{+L} \varphi_k(d_j)^2}$    (4.2.5)

Ralston shows that the sample coefficients are unbiased; that is, if
we know the exact coefficients in (4.2.1), then

$\langle b_k^{a} | \mathcal{E} \rangle = b_k .$    (4.2.6)

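Ralston also computes the variances of the coefficients for the case
where the variance of f(d) is independent of d:

${}^{v}\langle f | d, \mathcal{E} \rangle = {}^{v}\langle f | \mathcal{E} \rangle = \sigma^2$    (4.2.7)

The first three variances are:

${}^{v}\langle b_0^{a} | \mathcal{E} \rangle = \dfrac{\sigma^2}{2L + 1}$    (4.2.8)

${}^{v}\langle b_1^{a} | \mathcal{E} \rangle = \dfrac{3 L\, \sigma^2}{(L + 1)(2L + 1)}$    (4.2.9)

${}^{v}\langle b_2^{a} | \mathcal{E} \rangle = \dfrac{10 L (2L - 1)\, \sigma^2}{(2L + 3)(2L + 2)(2L + 1)}$    (4.2.10)

By a derivation similar to the one for the variances of the b's (see
Ralston [8, p. 247]), it is straightforward to show that the
covariances are zero:

$\mathrm{cov}\langle b_k^{a}, b_l^{a} | \mathcal{E} \rangle = 0 , \quad k \neq l$    (4.2.11)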
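
The orthogonality of the three polynomials on the sampling grid, and the
dependence of the coefficient variances on L, can be checked numerically.
The following sketch assumes the grid $d_j = j\Delta/L$ and polynomials of
the form $1$, $d/\Delta$ and $-(L+1)/(2L-1) + 3Ld^2/((2L-1)\Delta^2)$; the
function names are hypothetical. For a least squares fit with orthogonal
polynomials and constant variance, each coefficient variance is the noise
variance divided by the sum of squares of the polynomial over the grid.

```python
import numpy as np


def gram_basis(L, delta=1.0):
    """Return the grid d_j = j*delta/L and the first three polynomials on it."""
    j = np.arange(-L, L + 1)
    d = j * delta / L
    phi0 = np.ones_like(d)
    phi1 = d / delta
    phi2 = -(L + 1) / (2 * L - 1) + 3 * L * d**2 / ((2 * L - 1) * delta**2)
    return d, np.vstack([phi0, phi1, phi2])


def coefficient_variance_factors(L):
    """Variance of each sample coefficient per unit noise variance."""
    _, phi = gram_basis(L)
    return 1.0 / np.sum(phi**2, axis=1)
```

For L = 4 the three factors are 1/9, 12/45 and 280/990, which agree with
1/(2L+1), 3L/((L+1)(2L+1)) and 10L(2L-1)/((2L+3)(2L+2)(2L+1)).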

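Specializing Ralston's results to our problem, $\langle v | d, \mathcal{E} \rangle$ plays the
role of f(d). Taking the expectation of (4.1.1), we have

$\langle v | d, \mathcal{E} \rangle = a + \tfrac{1}{2} \langle s'E s | \mathcal{E} \rangle + \tfrac{1}{2}\, h\, d^2 .$    (4.2.12)

Fitting this curve using the orthogonal polynomials (4.2.2), (4.2.3)
and (4.2.4), the exact coefficients are:

$b_0 = a + \tfrac{1}{2} \langle s'E s | \mathcal{E} \rangle + \dfrac{(L + 1)\, h\, \Delta^2}{6L}$    (4.2.13)

$b_1 = 0$    (4.2.14)

$b_2 = \dfrac{(2L - 1)\, h\, \Delta^2}{6L}$    (4.2.15)

By direct calculation or by reference to Chapter 2, the exact solution
is

$\max_{d}\ \langle v | d, \mathcal{E} \rangle = \langle v | d(\mathcal{E}), \mathcal{E} \rangle = a + \tfrac{1}{2} \langle s'E s | \mathcal{E} \rangle .$    (4.2.16)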

Suppose that instead of calculating the coefficients exactly, we
approximate them by the sampling procedure suggested in Section 4.1.
Define ${}_{i}v_j$ as the $i$th sample at the $j$th decision point. Then we
can compute the sample expectation at each decision setting as

$\langle v | d_j, \mathcal{E} \rangle^{a} = \sum_{i=1}^{n_s} {}_{i}v_j \,/\, n_s ,$    (4.2.17)

where $n_s$ is the number of samples per decision setting. If we
substitute (4.2.17) into (4.2.5) as $f(d_j)$ and take the expectation, we
find after some algebraic manipulations,

$\langle b_k^{a} | \mathcal{E} \rangle = b_k , \quad k = 0, 1, 2 ,$    (4.2.18)

where the $b_k$'s are defined by (4.2.13), (4.2.14) and (4.2.15). In
other words $\langle v | d_j, \mathcal{E} \rangle^{a}$ as defined by (4.2.17) is an unbiased estimator
of $\langle v | d_j, \mathcal{E} \rangle$, and therefore Ralston's results apply.

The expected loss from using the approximate solution is

$\langle \ell | N, \mathcal{E} \rangle = \langle v | d(\mathcal{E}), \mathcal{E} \rangle - \langle \langle v | d(\mathcal{E})^{a}, \mathcal{E} \rangle^{a} | \mathcal{E} \rangle .$    (4.2.19)

The first term on the right of (4.2.19) is the exact solution from
(4.2.16). The second term is the expected solution based on sampling.
To find an expression for it, we define the approximate expected value
based on the sample data as

$\langle v | d, \mathcal{E} \rangle^{a} = \sum_{k=0}^{2} b_k^{a}\, \varphi_k(d)$    (4.2.20)

Maximizing (4.2.20), using (4.2.2), (4.2.3) and (4.2.4) for the
polynomials, we find the approximate optimal decision $d(\mathcal{E})^{a}$:

$d(\mathcal{E})^{a} = -\dfrac{(2L - 1)\, \Delta}{6L}\, \dfrac{b_1^{a}}{b_2^{a}}$    (4.2.21)

Substituting (4.2.21) into (4.2.20), and simplifying, we have

$\langle v | d(\mathcal{E})^{a}, \mathcal{E} \rangle^{a} = b_0^{a} - \dfrac{L + 1}{2L - 1}\, b_2^{a} - \dfrac{2L - 1}{12L}\, \dfrac{(b_1^{a})^2}{b_2^{a}}$    (4.2.22)


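To take the expectation of (4.2.22), we expand the reciprocal of $b_2^{a}$
in a Taylor series about the mean $b_2$:

$\dfrac{1}{b_2^{a}} \approx \dfrac{1}{b_2} - \dfrac{b_2^{a} - b_2}{b_2^{2}} + \dfrac{(b_2^{a} - b_2)^2}{b_2^{3}}$    (4.2.23)

Taking the expectation of (4.2.22) and using (4.2.23), we have

$\langle \langle v | d(\mathcal{E})^{a}, \mathcal{E} \rangle^{a} | \mathcal{E} \rangle \approx b_0 - \dfrac{L + 1}{2L - 1}\, b_2 - \dfrac{2L - 1}{12L}\, \dfrac{{}^{v}\langle b_1^{a} | \mathcal{E} \rangle}{b_2} \left[ 1 + \dfrac{{}^{v}\langle b_2^{a} | \mathcal{E} \rangle}{b_2^{2}} \right]$    (4.2.24)

Recalling that the means are given by (4.2.13) through (4.2.15) and
using (4.2.8) through (4.2.10) for the variances, (4.2.24) becomes

$\langle \langle v | d(\mathcal{E})^{a}, \mathcal{E} \rangle^{a} | \mathcal{E} \rangle = a + \tfrac{1}{2} \langle s'E s | \mathcal{E} \rangle - \dfrac{3 L\, \sigma^2}{(2L + 2)(2L + 1)\, h\, \Delta^2} \left[ 1 + \dfrac{720\, L^4\, \sigma^2}{(2L + 3)(2L + 2)(2L + 1)(2L)(2L - 1)\, h^2 \Delta^4} \right]$    (4.2.25)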

Expression (4.1.6) follows directly from (4.2.25) by letting L and
hence $n_d$ become large.


4.3  Discussion of the Expected Loss from Rough Quantization

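Expression (4.2.22) provides a more sophisticated basis than
(4.1.4) for discussing rough quantization. We define the sharpness
of the maximum at $d(\mathcal{E})$ as

$\dfrac{\Delta v}{\sigma}$    (4.3.1)

where

$\Delta v = \langle v | d(\mathcal{E}), \mathcal{E} \rangle - \langle v | d = \Delta, \mathcal{E} \rangle$    (4.3.2)

$\sigma^2 = {}^{v}\langle v | d(\mathcal{E}), \mathcal{E} \rangle .$    (4.3.3)

Using (4.2.10) for the quadratic value function, (4.3.1) becomes

$\dfrac{\Delta v}{\sigma} = -\dfrac{h\, \Delta^2}{2 \sigma} .$    (4.3.4)

Using (4.3.4), the final term in (4.2.25) may be written

$\dfrac{180\, L^4\, (\Delta v / \sigma)^{-2}}{(2L + 3)(2L + 2)(2L + 1)(2L)(2L - 1)}$    (4.3.5)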


For large N, this term represents the error that is introduced by
using $(2L + 1)$ quantization levels instead of infinitely many levels.

The term in (4.3.5) is plotted in Fig. 4.2. We see that for
sharp maxima rough quantization introduces little error. For maxima
that are not sharp, the error is larger but insensitive to quantization.
We conclude that although fine quantization is always better than rough
quantization, for practical purposes the sampling error depends only on
the total number of samples and not on the coarseness of the quantization.

Figure 4.2  Quantization error for a decision variable as a function of
the sharpness of the maximum and the number of quantization levels.
(Axes: percentage error, relative to one sample per level for large N,
versus number of decision levels; one curve per value of sharpness.)

CHAPTER  5 


APPLICATIONS 

5.0  Introduction 

This  chapter  is  the  focal  point  of  the  thesis.  In  Sections  5.1 
through  5.4  we  apply  the  results  of  Chapters  2,  3  and  4  to  develop  a 
systematic  design  framework.  It  applies  to  problems  with  continuous 
decision  variables.  The  design  is  optimal  in  the  sense  that  our  goal 
is  to  make  the  marginal  benefit  of  additional  analysis  equal  to  marginal 
cost.  In  section  5.5  we  discuss  how  the  framework  could  be  modified  to 
apply  to  budget  constrained  design  and  to  problems  with  discrete  deci¬ 
sion  variables. 

5.1  Preliminary Analysis

In Sections 5.1 through 5.4 we consider the problem introduced in
Section 2.3. The deterministic model v(s,d) can be approximated by
a second order Taylor series about the mean of the state variables and
the prior optimum decision. The model v(s,d) may be very complex.
A single evaluation of v(s,d) on a computer may cost many dollars.
To perform an exhaustive probabilistic analysis to find the exact
optimum decision $d(\mathcal{E})$ would be prohibitively expensive. Our
objective is to use preliminary data to identify cost effective additional
analyses. As input to our framework we require roughly encoded
parameters and deterministic sensitivity data. As output we recommend the
level of encoding for state variables, the proper detail of the
treatment of risk preference, and the amount of computation. We use
approximations to keep the cost and input requirements of our analysis
to a minimum.

The  Three  Steps  in  the  Preliminary  Phase 

Figure 5.1 summarizes the preliminary phase. Deterministic
sensitivity analysis yields the deterministic optimum $d_0$ and the first
and second partial derivatives with respect to the state and decision
variables at the operating point $(\bar{s}, d_0)$. For a general discussion of
sensitivity analysis, see Howard [1]. For a specific discussion of the
conversion of sensitivity data to approximate partial derivatives, see
Howard [2, Appendix A].

Once we have the sensitivity data we can normalize the state and
decision variables so that they are zero at the mean and deterministic
optimum respectively:

$\bar{s} = 0$    (5.1.1)

$d_0 = 0$    (5.1.2)

This step is not essential, but it simplifies notation and data
handling. Using the sensitivity data and the normalized variables, we can
fit a quadratic Taylor series model to v(s,d) at (s,d) = (0,0):

$v(s,d) = a + b's + \tfrac{1}{2}\, s'E s + s'G d + \tfrac{1}{2}\, d'H d$    (5.1.3)

The final step is to encode the matrix of covariances. Howard [2,
p. 511] suggests an encoding technique. In many problems the state
variables will be independent, reducing the task to encoding the
variances of the state variables. In this case we may directly encode
rough estimates of the variances, or we may estimate them based on the
ranges used for the sensitivity analysis. Conversion from a range to a
variance involves the parameter A which is discussed in Section 5.2.

Figure 5.1  Preliminary analysis. Flowchart steps:

DETERMINISTIC SENSITIVITY ANALYSIS. Perform deterministic sensitivity
analysis to find $d_0$, the deterministic optimum, and the first and
second order partial derivatives at $d = d_0$ and $s = \bar{s}$.
Ref: Howard [2, Appendix A]

NORMALIZE AND FIT. Normalize the state and decision vectors such that
$\bar{s} = 0$ and $d_0 = 0$, and fit
$v(s,d) = a + b's + \tfrac{1}{2}\, s'E s + s'G d + \tfrac{1}{2}\, d'H d$.
Ref: Section 2.3

ENCODE COVARIANCES. Encode the covariance matrix (Ref: Howard [2, p. 511]),
or encode A and default to $\mathrm{cov}_{ij} = 0$ for $i \neq j$ and
$\mathrm{cov}_{ij} = \Delta_i^2 / A$ for $i = j$ (Ref: Section 5.2).

GO TO ENCODING DECISION.


Illustrative Example. The Entrepreneur's Problem

To illustrate the methodology of this chapter we shall apply each
new procedure to the Entrepreneur's Problem, which was introduced in
Chapter 2. Expressing the profit $\pi$ in millions of dollars and denoting
the cost $\Delta c$ as $s_1$ and the quantity $\Delta q$ as $s_2$, the coefficients
of (5.1.3) are:

$a = 198$    (5.1.4)

$b' = [\,-1 \quad 17.58\,]$    (5.1.5)

$E = \begin{bmatrix} 0 & 0 \\ 0 & 0.0497 \end{bmatrix}$    (5.1.6)

$G' = [\,0 \quad 0.835\,]$    (5.1.7)

$H = [\,-3.67\,]$    (5.1.8)

The covariance matrix and the risk aversion coefficient are:

$C = \begin{bmatrix} 10{,}000 & 0 \\ 0 & 100 \end{bmatrix}$    (5.1.9)

$\gamma = 0.004$    (5.1.10)


The  Validity  of  the  Approximations 

Once we complete the preliminary analysis, we check to see if the
approximations developed in Chapters 2 and 3 are applicable. Figure 5.2
illustrates the formal checks. In Section 3.2 we developed expressions
for three quantities that should be small relative to the value of
information of the state variables. If these conditions are not met,
Fig. 5.2 indicates that the user should be warned. In this situation,
careful modification of the framework, perhaps using the assumption of
Section 3.2 that state variables are normal and independent, may
salvage the analysis. To make the framework general enough to handle
violations of the conditions would make it too complex to be of practical
use.

Figure 5.2  Checks to see if approximations are valid. (Flowchart: from
the preliminary data, encode the risk aversion coefficient $\gamma$ and
calculate the FLAG quantities; if any of the three tests of the form
"< 10*FLAG" is TRUE, print a WARNING; if all are FALSE, continue.)

An  informal  check  should  be  applied  at  this  point.  The  expres¬ 
sions  of  Chapters  2  and  3  assume  that  the  probability  density  func¬ 
tions  are  roughly  centrally  symmetric.  If  the  user  feels  that  any 
third  covariances  are  large,  he  should  proceed  with  caution.  The  ex¬ 
pressions  of  Chapter  3,  particularly  (3.2.22),  should  help  the  user  to 
assess  the  seriousness  of  an  asymmetry. 

Application  of  the  Checks  to  the  Entrepreneur's  Problem 

Applied to the Entrepreneur's Problem the three checks of Fig. 5.2
are:

$\langle v_c | \mathcal{E} \rangle = 26 * \mathrm{FLAG}_1$    (5.1.11)

$\langle v_c | \mathcal{E} \rangle = 50 * \mathrm{FLAG}_2$    (5.1.12)

$\langle v_c | \mathcal{E} \rangle = 2 * \mathrm{FLAG}_3$    (5.1.13)

As we shall see in Section 5.3 the low value of 2 in the third check
is an early warning that the Entrepreneur's Problem is highly risk
sensitive.

To check for the impact of asymmetry, suppose that the probability
distribution on $\Delta q$ is lognormal with the same mean and variance as
before. The lower bound of the probability density function is -68
corresponding to q = 0. From (3.2.22) third covariances are
unimportant if

$\langle s'E s\, s' | \mathcal{E} \rangle \ll b'C .$    (5.1.14)

Computing these quantities the inequality is

$[\,0 \quad 26\,] \ll [\,0 \quad 1758\,] .$    (5.1.15)

The assumed asymmetry produces negligible changes in the results.

5.2  Encoding State Variables

The first analytical option is whether to gather additional data
about the state variables. In theory the data about the state
variables could come from a variety of sources: from a simulation model,
from an experiment, or from an expert. As long as the prior variance
of the posterior mean ${}^{v}\langle \langle s | D, \mathcal{E} \rangle | \mathcal{E} \rangle$ can be assessed, the evaluation
scheme of Fig. 5.3 applies. In practice the most likely application is
where the data comes from an encoding interview with the decision maker
or his designated experts. At the end of this section we discuss how
the input data might be generated for this case.

The  iterative  encoding  procedure  of  Fig.  5.3  applies  when  the 
state  variables  are  independent  and  the  cost  of  encoding  each  variable 
is  a  constant  K  .  If  the  encoding  costs  were  a  function  of  n  the 
number  of  questions  in  an  encoding  interview,  then  the  preposterior 
variance  would  have  to  be  specified  as  a  function  of  n  .  This  would 
require  experimental  research  on  how  the  mean  of  a  decision  maker's 
distribution varies during an interview. We leave this topic for
further  research.  For  our  analysis  we  assume  that  we  either  use  the 
preliminary  estimate  or  encode  a  distribution  which  the  decision  maker 
accepts  as  representing  his  state  of  information. 

The calculation for the value of the data is based on the ranking
scheme from Chapter 2. Assuming that encoding a probability
distribution on one variable gives no information about other variables, the
value of data on the $i$th variable $\langle v_{D_i} | \mathcal{E} \rangle$ is given by (2.4.8)
specialized to the case where only the $i$th variance is non-zero:

$\langle v_{D_i} | \mathcal{E} \rangle = -\tfrac{1}{2}\, g_i\, H^{-1} g_i'\; {}^{v}\langle \langle s_i | D_i, \mathcal{E} \rangle | \mathcal{E} \rangle$    (5.2.1)


The heart of the encoding procedure is the iterative loop in the
middle of Fig. 5.3. We sort the values of encoding to form a list with
the largest value $\langle v_{D_j} | \mathcal{E} \rangle$ at the top. If the value of encoding exceeds
the cost we encode the $j$th variable completely, yielding a new mean and
variance. The variance ${}^{v}\langle \langle s_j | D_j, \mathcal{E} \rangle | \mathcal{E} \rangle$ is zero after the complete
encoding, dropping the $j$th variable to the bottom of the list. We
continue until the maximum value of encoding is less than the cost.
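
The iterative loop can be sketched as follows. This is an illustration
under stated assumptions rather than a transcription of Fig. 5.3: the
function name is hypothetical, $g_i$ is taken to be the $i$th row of G, the
state variables are independent, the encoding cost is the same constant
for every variable, and the re-assessment of the mean and variance after
each interview is not modeled.

```python
import numpy as np


def encoding_plan(G, H, var_post_mean, cost):
    """Return the indices of state variables worth encoding, in order.

    G             : n_s x n_d matrix of mixed partial derivatives
    H             : n_d x n_d Hessian with respect to the decision vector
    var_post_mean : prior variance of the posterior mean for each variable
    cost          : constant cost of encoding one variable
    """
    G = np.atleast_2d(np.asarray(G, dtype=float))
    H_inv = np.linalg.inv(np.atleast_2d(np.asarray(H, dtype=float)))
    remaining = np.asarray(var_post_mean, dtype=float).copy()
    encoded = []
    while True:
        # Value of encoding each variable: -1/2 g_i H^{-1} g_i' times its variance.
        values = np.array([-0.5 * (G[i] @ H_inv @ G[i]) * remaining[i]
                           for i in range(len(remaining))])
        j = int(np.argmax(values))       # top of the sorted list
        if values[j] <= cost:
            break                        # best remaining value does not cover the cost
        encoded.append(j)
        remaining[j] = 0.0               # fully encoded: no further mean shift expected
    return encoded
```

With G' = [0 0.835], H = [-3.67] and a prior variance of the posterior mean
equal to half of each prior variance, the sketch selects only the second
state variable for any cost below the resulting value.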

The  iterative  procedure  is  based  on  the  assumption  that  the  joint 
value  of  clairvoyance  on  two  state  variables  is  equal  to  the  sum  of  the 
individual  values  of  clairvoyance.  The  assumption  is  good  if  the  joint 
value  does  not  greatly  exceed  the  marginal  values. 

At  the  end  of  Fig.  5.3  we  set  the  variance  of  any  unencoded  vari¬ 
able  to  the  sum  of  the  point  estimate  of  the  variance  and  the  prior 

variance  of  the  posterior  mean.  These  quantities  are  discussed  in  Sec¬ 
tion  2.4. 
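
Encoding Variances

The key to the encoding decision is estimating ${}^{v}\langle \langle s_i | D_i, \mathcal{E} \rangle | \mathcal{E} \rangle$ and
$\langle\, {}^{v}\langle s_i | D_i, \mathcal{E} \rangle | \mathcal{E} \rangle$. Since both are measures of the dispersion of the
distribution, they are related to $\Delta_i$, the difference between the high
and low estimates for $s_i$.

If $\{s_i | \mathcal{E}\}$ is uniformly distributed between the high and low
estimates, the variance is

${}^{v}\langle s_i | \mathcal{E} \rangle = \dfrac{\Delta_i^2}{12}$    (5.2.2)

If $\{s_i | \mathcal{E}\}$ is normally distributed with the high and low values each
three standard deviations from the mean, the variance is

${}^{v}\langle s_i | \mathcal{E} \rangle = \dfrac{\Delta_i^2}{36}$    (5.2.3)

A reasonable model of the relationship between the variance and $\Delta_i$ is

${}^{v}\langle s_i | \mathcal{E} \rangle = \dfrac{\Delta_i^2}{A} .$    (5.2.4)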
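
The conversion from a range to a variance is a one-line calculation. The
sketch below uses hypothetical names; the two constants are the standard
results that a uniform distribution has variance equal to its range squared
over 12, and that a range of six standard deviations gives a variance equal
to the range squared over 36.

```python
A_UNIFORM = 12.0        # uniform between the high and low estimates
A_NORMAL_3SIGMA = 36.0  # normal, high and low three standard deviations out


def variance_from_range(delta, A):
    """Convert a high-minus-low range delta into a variance, delta**2 / A."""
    if A <= 0:
        raise ValueError("A must be positive")
    return delta**2 / A
```

For a given range, any choice of A between these two constants gives a
variance between the normal and uniform values.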


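Assuming that ${}^{v}\langle \langle s_i | D_i, \mathcal{E} \rangle | \mathcal{E} \rangle$ and $\langle\, {}^{v}\langle s_i | D_i, \mathcal{E} \rangle | \mathcal{E} \rangle$ can be
expressed as $\Delta_i^2$ divided by $A_{ve}$ and $A_{ev}$ respectively and using
(2.4.19), the variance is

${}^{v}\langle s_i | \mathcal{E} \rangle = \dfrac{\Delta_i^2}{A}$    (5.2.5)

where

$\dfrac{1}{A} = \dfrac{1}{A_{ve}} + \dfrac{1}{A_{ev}}$    (5.2.6)

Of course, assignment of $A_{ve}$ and $A_{ev}$ for any given problem depends
on how the decision maker or his designated analyst interprets the
terms "high value" and "low value." Hopefully, during future decision
analyses, data can be gathered on the relationship between $A_{ve}$ and
$A_{ev}$.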

When we assume that $A_{ve}$ and $A_{ev}$ are the same for each variable,
our encoding procedure ranks the variables by their approximate values
of clairvoyance. Even in this simple case, our analysis gives us
insight. A common misconception is that the importance of a
state variable is measured by the first partial derivative of the value
function. In fact, the first partial derivative has nothing to do with
the value of data for the risk-indifferent problem with a quadratic
value function.
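
A small numerical check makes the last point concrete. The value function below, `v(s, d) = a*s + b*s*d - 0.5*c*d**2`, is a hypothetical quadratic chosen for illustration and is not the Entrepreneur's Problem. The coefficient `a` controls the first partial derivative with respect to `s`, yet changing it leaves the value of clairvoyance unchanged; only the cross term `b`, the curvature `c`, and the variance of `s` matter.

```python
def value_of_clairvoyance(a, b, c, outcomes):
    """Expected gain from learning s before choosing d (risk-indifferent).

    Hypothetical quadratic value function: v(s, d) = a*s + b*s*d - 0.5*c*d**2,
    with c > 0. `outcomes` is a list of (probability, s) pairs.
    """
    def v(s, d):
        return a * s + b * s * d - 0.5 * c * d ** 2

    mean_s = sum(p * s for p, s in outcomes)
    d_prior = b * mean_s / c                  # best single decision without data
    without = sum(p * v(s, d_prior) for p, s in outcomes)
    with_info = sum(p * v(s, b * s / c) for p, s in outcomes)  # d tuned to each s
    return with_info - without


lottery = [(0.25, 8.0), (0.5, 10.0), (0.25, 12.0)]   # variance of s is 2.0
flat = value_of_clairvoyance(a=0.0, b=1.0, c=2.0, outcomes=lottery)
steep = value_of_clairvoyance(a=100.0, b=1.0, c=2.0, outcomes=lottery)
```

Both `flat` and `steep` equal `0.5 * b**2 * variance / c`, which is 0.5 here, even though the first partial derivative differs by 100 between the two cases.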

Encoding  in  the  Entrepreneur's  Problem 


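Rather than specifying the prior variance of the posterior mean
directly, we parameterize the solution on the ratio $r$:

$$r = \frac{{}^{v}\langle\langle s_i \mid D_i,\mathcal{E}\rangle \mid \mathcal{E}\rangle}{{}^{v}\langle s_i \mid \mathcal{E}\rangle} = \frac{A}{A_{ve}} . \tag{5.2.7}$$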


The values of encoding for the state variables in the Entrepreneur's
Problem are

$$\langle v_{D_1} \mid \mathcal{E}\rangle = 0 \tag{5.2.8}$$

$$\langle v_{D_2} \mid \mathcal{E}\rangle = 9.5\, r . \tag{5.2.9}$$

The value of encoding the cost $s_1$ is zero because the mixed partial
derivative of profit with respect to cost and price is zero. Since the
Entrepreneur's Problem does not include the option to stop, the costs
will be incurred regardless of the pricing decision. There is no value
in learning the exact value of the sunk costs.

The cost of encoding $K_E$ should be approximately a thousand dollars,
or $10^{-3}$ million dollars. Since we expect $r$ to be in the range
of 0.1 to 1.0 for a typical problem, the decision is clearly to encode
$s_2$ and not to encode $s_1$.
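
The comparison behind this decision is simple enough to spell out. The sketch below assumes that the values of encoding and the encoding cost are expressed in the same units (millions of dollars), and the function names are hypothetical.

```python
ENCODING_COST = 1e-3   # about one thousand dollars, in millions of dollars


def value_of_encoding(r):
    """Values of encoding the two state variables for a given ratio r."""
    return {"s1": 0.0, "s2": 9.5 * r}


def variables_worth_encoding(r, cost=ENCODING_COST):
    """Names of the variables whose value of encoding exceeds the cost."""
    return [name for name, value in value_of_encoding(r).items() if value > cost]
```

Across the expected range of `r` from 0.1 to 1.0 only the second variable is selected; under these assumptions the ratio would have to fall to roughly 0.0001 before encoding it stopped paying for itself.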

5.3  The Choice of Risk Attitude

Risk preference follows the encoding of state variables both in
the chronology of decision analysis and in the complexity of
computation. In this section we choose the appropriate risk attitude for the
probabilistic phase based on preliminary estimates of the risk aversion
coefficient.

The options for the probabilistic phase are:

(i)  Linear utility, where $\gamma$ is zero; the decision maker
is risk-indifferent.

(ii)  Exponential utility, where $\gamma$ is fixed; the decision
maker has constant risk preference.

(iii)  Complete utility, where $\gamma$ is a function of wealth;
the decision maker has wealth-sensitive risk
preference.

The  Flow  Chart 

Figure 5.4 summarizes the evaluation of the alternatives. The
preliminary attitude is summarized by the risk preference coefficient
$\bar{\gamma}$. The potential risk preference coefficients after a thorough
encoding of the decision maker's risk attitude and an accurate
computation of the profit lottery are summarized by $\overset{v}{\gamma}$.

The costs $K_{12}$ and $K_{23}$ include education, assessment, and
computation relative to option (ii). The negative of $K_{12}$ is the savings
that would occur if risk preference were suppressed in the
probabilistic phase. $K_{23}$ is the additional cost of encoding the decision maker's
complete risk preference function and using it for computations. The
logic for comparing costs and gains is shown at the bottom of Fig. 5.4.
We assume that

$$K_{23} \ge -K_{12} \tag{5.3.1}$$

and

$$\langle v_\gamma \mid \mathcal{E}\rangle \le \text{(the gain from moving from level (i) to level (ii))} \tag{5.3.2}$$

so that level (iii) is not cost effective if level (ii) is not.

Figure 5.4  Choice of risk preference alternative

Discussion of Input Data

The  practical  application  of  Fig.  5.4  depends  on  an  accurate 
estimation  of  the  preliminary  risk  attitude  and  of  the  costs.  We  con¬ 
sider  each  below. 

The preliminary risk attitude can come from several sources.
First, we could encode the entire risk preference function from an
assistant. Second, the analyst could examine the risk attitudes of
similar decision makers. For example, we expect two corporations with the
same assets and earnings to have approximately the same risk attitude.
Third, and most promising, we can use modeling. Suppose we assume that
$u(v)$ is logarithmic:

$$u(v) = \ln(v + a) . \tag{5.3.3}$$

Then  we  can  ask  the  decision  maker  the  single  question: 

Suppose you had the opportunity to call the toss of a
fair coin. You win $a$ dollars if you call correctly
and you lose $a/2$ dollars if you are incorrect. For
what prize $a$ are you indifferent between playing
and not playing?

The prize $a$ is the same as the parameter $a$ in (5.3.3), so that
$u(v)$ is completely specified.
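
The claim that the indifference prize equals the parameter of the logarithmic utility can be verified directly: with $u(v) = \ln(v + a)$, winning $a$ and losing $a/2$ with equal probability gives an expected utility of $\tfrac{1}{2}\ln(2a) + \tfrac{1}{2}\ln(a/2) = \ln a$, which is the utility of not playing. The short sketch below checks this numerically; the function names are illustrative only.

```python
import math


def log_utility(v, a):
    """Logarithmic utility u(v) = ln(v + a)."""
    return math.log(v + a)


def coin_toss_gain(a):
    """Expected utility of playing minus the utility of not playing."""
    play = 0.5 * log_utility(a, a) + 0.5 * log_utility(-a / 2.0, a)
    return play - log_utility(0.0, a)
```

For any positive `a` the function returns zero up to rounding error, so a decision maker with this utility is exactly indifferent.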


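The future risk attitude is described by how much $\bar{\gamma}$ varies from
$\gamma_0$. We define $\gamma_0$ as the coefficient which would make the certain
equivalent approximation exact given the correct profit lottery and
utility function,

$${}^{\sim}\langle v \mid d(\mathcal{E}),\mathcal{E}\rangle = \langle v \mid d(\mathcal{E}),\mathcal{E}\rangle - \tfrac{1}{2}\,\gamma_0\; {}^{v}\langle v \mid d(\mathcal{E}),\mathcal{E}\rangle . \tag{5.3.4}$$

We identify three ways that $\gamma_0$ can vary from $\bar{\gamma}$.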

First, the approximate mean is not exact. We might estimate how
much the risk aversion coefficient changes with changes in the mean by
assuming that the true utility function is logarithmic. The variance
of the difference between the accurate and preliminary means of the
profit lottery could then be used to impute a variance in $\gamma$. The
variance of the profit lottery mean is discussed in the next section.

Second, the variance of the profit lottery may be large enough
that exponential utility is not a good local approximation. We can
check this effect by assuming again that the true utility is
logarithmic. Then we can calculate how much we would have to change $\gamma$ to
make the exponential certain equivalent equal to the logarithmic one.
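
The second check can be sketched as a one-dimensional root search. The code below is an illustration under stated assumptions: the lottery is a list of discrete (probability, outcome) pairs, the function names are hypothetical, and bisection is used only because the exponential certain equivalent decreases as the coefficient grows. The search bracket must contain the answer, which depends on the scale of the outcomes.

```python
import math


def log_certain_equivalent(outcomes, a):
    """Certain equivalent under u(v) = ln(v + a); outcomes are (prob, v) pairs."""
    expected_utility = sum(p * math.log(v + a) for p, v in outcomes)
    return math.exp(expected_utility) - a


def exp_certain_equivalent(outcomes, gamma):
    """Certain equivalent under exponential utility with coefficient gamma > 0."""
    return -math.log(sum(p * math.exp(-gamma * v) for p, v in outcomes)) / gamma


def matching_gamma(outcomes, a, lo=1e-9, hi=1.0, steps=200):
    """Bisect for the gamma whose exponential certain equivalent equals the log one."""
    target = log_certain_equivalent(outcomes, a)
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        # The exponential certain equivalent falls as gamma rises.
        if exp_certain_equivalent(outcomes, mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


example = [(0.5, -10.0), (0.5, 30.0)]        # made-up two-point lottery
gamma_match = matching_gamma(example, a=50.0)
```

For this made-up lottery the local coefficient at the mean is 1/(10 + 50), about 0.0167, while the matching coefficient comes out slightly larger; the gap between the two is the kind of variation the paragraph above describes.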

Third, even if the decision maker's risk attitude is adequately
estimated by the exponential utility function, our preliminary estimate
of the coefficient may be wrong. We can directly encode how much the
mean will shift during an encoding session. Just as for state variable
encoding, experimental data are helpful in assessing potential mean
shifts. Spetzler [8] has made a start in this direction.

Computing the three risk coefficient variations described above
should help in assigning $\overset{v}{\gamma}$, the prior variance of the posterior risk
coefficient.


Costs 


The final question is costs. The gains from going from level (ii)
to level (i) should not be large. By suppressing risk aversion, the
analyst does not have to educate the decision maker or assess $\gamma$
accurately. Computationally, exponential utility is almost as tractable
as linear utility. Exponential utility has the delta property: if a
fixed number of dollars $\delta$ is added to each prize in a lottery, the
certain equivalent of the lottery is increased by $\delta$. Because of the
delta property, computation with exponential utility can be decomposed.
The profit lottery can be generated without considering risk aversion
or utility, and the certain equivalent can be computed later.
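
The delta property is easy to confirm numerically. The sketch below uses a made-up three-outcome lottery and an illustrative coefficient; only the functional form of the exponential certain equivalent is standard.

```python
import math


def exp_certain_equivalent(outcomes, gamma):
    """Certain equivalent of (prob, prize) pairs under exponential utility."""
    return -math.log(sum(p * math.exp(-gamma * v) for p, v in outcomes)) / gamma


def shift(outcomes, delta):
    """Add a fixed number of dollars delta to every prize."""
    return [(p, v + delta) for p, v in outcomes]


lottery = [(0.3, -50.0), (0.5, 100.0), (0.2, 400.0)]   # illustrative prizes
gamma = 0.004
base = exp_certain_equivalent(lottery, gamma)
moved = exp_certain_equivalent(shift(lottery, 25.0), gamma)
```

Here `moved - base` equals 25 up to rounding error, which is why the profit lottery can be generated first and the certain equivalent computed afterwards.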


By going from level (ii) to level (iii), we find that the costs
rise sharply. The education and assessment costs can be large,
especially if the decision maker is an organization. Spetzler's [8] study
indicates that accurate determination of $\gamma$ is a time-consuming job.
The computational burden is also large because we lose the delta
property.


Risk  Preference  in  the  Entrepreneur's  Problem 


The key inputs for the Entrepreneur's Problem are the mean and
variance of $\gamma$. From Section 5.1 we have

$$\bar{\gamma} = 0.004 . \tag{5.3.5}$$


To estimate the variance we compute the asset level which would imply
(5.3.5). Combining the definition of the local risk aversion coefficient
with the expression for logarithmic utility we have

$$\gamma(v) = -\frac{u''(v)}{u'(v)} = \frac{1}{v + a} , \tag{5.3.6}$$

where prime denotes differentiation. Evaluating at $\langle v \mid d(\mathcal{E}),\mathcal{E}\rangle = 200.5$
and $\gamma = 0.004$, the asset level $a$ is

$$a = \frac{1}{0.004} - 200.5 = 49.5 . \tag{5.3.7}$$

Figure 5.5 shows the entrepreneur's normalized profit lottery.
The expected value is 1.0 standard deviations above zero. The prior
assets $a$ are only 0.25 standard deviations. Therefore, if the
outcome of the lottery is more than 1.25 standard deviations below the
mean, the loss will exceed the entrepreneur's assets, resulting in
bankruptcy. Assuming normality, there is an 11% chance of bankruptcy.
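
The asset level and the bankruptcy probability can be reproduced with the standard normal distribution. The sketch below takes the mean of 200.5 and the coefficient of 0.004 from the text and assumes, as stated, that the mean lies exactly one standard deviation above zero; the variable names are illustrative.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


mean_profit = 200.5                       # expected value of the lottery
gamma_bar = 0.004                         # preliminary risk aversion coefficient
assets = 1.0 / gamma_bar - mean_profit    # asset level implied by log utility
std_profit = mean_profit / 1.0            # mean is 1.0 standard deviations above zero

# Bankruptcy occurs when the outcome falls below the negative of the assets.
z = (-assets - mean_profit) / std_profit
p_bankrupt = normal_cdf(z)
```

The assets come out at 49.5, about a quarter of a standard deviation, and the probability of an outcome roughly 1.25 standard deviations below the mean is a little under 11%.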

Given a nonzero probability of bankruptcy, the logarithmic certain
equivalent for the lottery is negative and arbitrarily large in magnitude. Because
the logarithmic and exponential certain equivalents vary widely, the
actual risk aversion coefficient may be quite different from the
preliminary estimate. A variance that reflects this uncertainty is

$$\overset{v}{\gamma} = 0.0004 . \tag{5.3.8}$$


The remaining inputs for Fig. 5.4 are $K_{12}$ and $K_{23}$. The
savings from using level (i), $-K_{12}$, is the cost of encoding the risk
aversion coefficient. Like the cost of encoding a state variable, this
amount is approximately one thousand dollars:

$$-K_{12} = 10^{-3} . \tag{5.3.9}$$

We assume that the cost of encoding and using the complete risk
preference function is ten thousand dollars:

$$K_{23} = 10^{-2} . \tag{5.3.10}$$

Comparing the costs and benefits as required by the decision boxes
of Fig. 5.4, we find that the best decision is clearly to encode and
use the complete utility function.
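
One plausible reading of the cost and gain comparison is sketched below. The exact layout of the decision boxes in the flow chart is not reproduced here, so the ordering of the tests, the function name, and the argument `gain_12` (the value of moving from linear to exponential utility) are assumptions made for illustration.

```python
def choose_risk_level(gain_12, gain_23, k12, k23):
    """Pick a risk preference level by comparing gains with costs.

    gain_12 -- hypothetical value of moving from level (i) to level (ii)
    gain_23 -- value of moving from level (ii) to level (iii)
    k12     -- cost of level (i) relative to (ii); its negative is the savings
    k23     -- additional cost of level (iii) relative to (ii)
    """
    if gain_23 > k23:
        return "iii"    # the complete utility function pays for itself
    if gain_12 > -k12:
        return "ii"     # exponential utility is worth its extra cost
    return "i"          # suppress risk preference
```

With costs of 0.001 and 0.01 (in millions of dollars) any gain from complete encoding above 0.01 selects level (iii). The gains themselves have to come from the value of data calculations and are not computed by this sketch.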


5.4  The Choice of a Computational Alternative

The purpose of computation is to accurately estimate the optimum
decision $\hat{d}(\mathcal{E})$. The options considered in this section are to use
$d(\mathcal{E})$, the approximate optimum from Chapter 3, or to find a more accurate
estimate through Monte Carlo simulation. The analysis is limited to
problems with a single decision variable.

Using  the  preliminary  estimate  directly  is  the  best  alternative 
if  it  is  accurate  and  the  cost  per  evaluation  of  the  deterministic  model 
is  high.  In  this  case  our  preliminary  estimate  of  the  optimum  decision 
becomes  our  actual  decision. 

Monte Carlo sampling is described in Chapter 4. Random samples
are drawn from $\{s \mid \mathcal{E}\}$. At each decision setting $d_j$, an
approximation to $\langle v \mid d_j,\mathcal{E}\rangle$ is computed. Then a least squares curve is fit
through the means and maximized.
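
The sampling procedure can be sketched as follows. This is a minimal illustration, not the scheme of Chapter 4: it assumes the fitted curve is a quadratic in the decision variable, uses the same number of samples at every setting, and all function and parameter names are hypothetical. Only the Python standard library is used.

```python
import random


def fit_quadratic(xs, ys):
    """Least squares fit of y = c0 + c1*x + c2*x**2 via the normal equations."""
    # Build the 3x3 normal equations as an augmented matrix.
    rows = [[sum(x ** (i + j) for x in xs) for j in range(3)]
            + [sum(y * x ** i for x, y in zip(xs, ys))] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(3):
            if r != col:
                factor = rows[r][col] / rows[col][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [rows[i][3] / rows[i][i] for i in range(3)]


def monte_carlo_optimum(value, sample_state, settings, n_per_setting, seed=0):
    """Estimate the decision that maximizes the fitted mean value curve."""
    rng = random.Random(seed)
    means = []
    for d in settings:
        draws = [value(sample_state(rng), d) for _ in range(n_per_setting)]
        means.append(sum(draws) / n_per_setting)
    # Centering the decision axis keeps the normal equations well conditioned.
    center = sum(settings) / len(settings)
    shifted = [d - center for d in settings]
    c0, c1, c2 = fit_quadratic(shifted, means)
    return center - c1 / (2.0 * c2)   # vertex of the parabola; a maximum when c2 < 0
```

As a usage example, a made-up value function `100 - (d - 25)**2 + s` with zero-mean noise `s`, evaluated at settings 15, 20, 25, 30, and 35, returns an optimum close to 25.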

Inability to Estimate Tree Errors

Logically, we would include a decision tree as a computational
alternative. Decision trees are generated by discretizing both $d$ and
$\{s \mid \mathcal{E}\}$. The optimum decision is computed by rolling back the tree.

Unfortunately, the optimal tree for a quadratic value function has
exactly one terminal node. The problem is that errors for trees with
two or more branches per state variable depend on partial derivatives
of greater than second order. Since we have suppressed these
derivatives by using a quadratic value function, we have lost the ability to
design decision trees.

The  Evaluation  Scheme 

The evaluation scheme is shown in Fig. 5.6. The only new input
required is the cost per Monte Carlo sample $K_C$, which normally is
dominated by the cost per evaluation of the deterministic model $v(s,d)$.

In Chapter 4 we show that the expected loss is a function of the
total sample size $N$, even if $d$ is roughly quantized. If the sample
data swamp the prior data, the $N^*$ computed in Fig. 5.6 is the
optimal sample size. However, if $N^*$ is small we leave it to the user
to decide whether to sample or not.

Discussion 

Instead of leaving the final choice to the user, we could encode
the prior variance of the posterior mean

$${}^{v}\langle\langle v \mid N,\mathcal{E}\rangle \mid \mathcal{E}\rangle . \tag{5.4.1}$$

The sampling scheme of Chapter 4 would have to be modified to include
prior data. Then the calculation of optimal $N^*$ would include the
decision: stop or sample.

However, we feel that encoding the quantity in (5.4.1) is as
difficult as directly deciding to sample or not. In most practical
analyses we suspect the decision will clearly be to sample. In this case
the sample data will swamp the prior data, and our evaluation scheme
eliminates unnecessary encoding.

Computation in the Entrepreneur's Problem

Suppose we decide to compute the Entrepreneur's profit for prices
from 15 to 35. If the cost per sample $K_C$ is \$0.10, then the optimum
sample size $N^*$ is

$$N^* = 820 . \tag{5.4.2}$$

Since we evaluated $v(s,d)$ far fewer than 820 times during the
preliminary analysis, we expect the Monte Carlo data to swamp our prior
knowledge. Therefore, the results of Chapter 4 hold, and the decision
is clearly to sample. The cost of the sampling program is about \$80,
negligible compared to the encoding costs.

Conclusion  of  the  Entrepreneur's  Problem 

Our framework has given us insight into the Entrepreneur's Problem.
Although computation is necessary, it is routine. Encoding the state
variable $s_2$ is important. Using the rough estimate of $\{s_2 \mid \mathcal{E}\}$ would
result in an expected loss of about 5% of the expected value of the
lottery. The crucial issue is risk preference. Because the lottery is
the entrepreneur's largest asset, we expect risk preference to be the
dominant issue. The large value of additional risk assessment, 50% of
the expected value of the lottery, indicates that the framework has
pinpointed the critical issue.

5.5  Extensions to Budget Constrained Design and to Discrete Decisions

Both budget constrained design and discrete decisions should be
straightforward to include within our framework.

For budget constrained design we need to compute an approximate
benefit-cost ratio for each option. For encoding we compute the expected
value of sample data divided by the cost $K_E$ for each variable. For
risk preference we divide the gain from going from level (i) to level (ii)
by the cost $-K_{12}$. Similarly, we compute the ratio of $\langle v_\gamma \mid \mathcal{E}\rangle$ to
$K_{23}$ for the level (ii) to level (iii) transition. For Monte Carlo
computation the benefit to cost ratio is computed as a function of the
number of samples $N$ in Fig. 5.6.

Budget  constrained  design  would  balance  the  overall  effort  in  each 
of  the  three  areas.  As  nearly  as  the  discrete  nature  of  the  options 
would  allow,  the  last  variable  encoded  and  the  last  sample  taken  would 
have  the  same  benefit-cost  ratio  as  the  risk  preference  level  chosen. 
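
A budget constrained design along these lines might be sketched as a ranking by benefit-cost ratio. The code below is a simple greedy heuristic offered only as an illustration: the function name and tuple layout are hypothetical, it ignores precedence between options (for example, that level (iii) presupposes level (ii)), and it treats each option as indivisible.

```python
def allocate_budget(options, budget):
    """Fund options in decreasing order of benefit-cost ratio.

    options -- list of (name, benefit, cost) tuples with cost > 0
    budget  -- total amount available for the analysis
    """
    ranked = sorted(options, key=lambda o: o[1] / o[2], reverse=True)
    chosen, spent = [], 0.0
    for name, benefit, cost in ranked:
        if spent + cost <= budget:
            chosen.append(name)
            spent += cost
    return chosen, spent
```

With made-up options `("a", 10.0, 2.0)`, `("b", 9.0, 3.0)`, and `("c", 4.0, 1.0)` and a budget of 3.0, the heuristic funds `a` and `c`, the two options with the highest ratios that fit.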

The framework could be modified to treat problems with discrete
decision variables by basing the value of information on closed loop
sensitivities as in Section 2.5. The risk-preference evaluation would
require a closed loop risk sensitivity. The comparison of Monte Carlo
and trees is simplified since the decision variable is already
discretized. However, tree errors remain difficult to compute.

CHAPTER 6


SUMMARY  AND  SUGGESTIONS  FOR  FUTURE  WORK 

The  objective  of  the  thesis  was  to  create  a  paradigm  to  evaluate 
the  economic  value  of  analysis.  To  achieve  that  goal  in  Chapter  5,  we 
derived  theoretical  results  in  Chapters  2,  3,  and  4. 

Summary 

The theorem of Chapter 2 introduced the key concept that the value
of data for a continuous quadratic problem is proportional to the prior
covariance of the posterior means of the state variables. We showed that
special cases of the theorem are well known. The constant of
proportionality in the theorem contained only second order partial derivatives,
which could be evaluated from closed loop sensitivities. Using the idea
of compensation we derived a methodology which ranks the state variables
accurately for a broad class of decision problems. The final conclusion
from Chapter 2 was that the loss from deliberate introduction of error had
the same form as the value of data with squared means replacing variances.

The theorem of Chapter 3 gave the value of data for the
risk-sensitive quadratic problem using the certain equivalent approximation.
To operationalize the result we assumed that the state variables were
normal and independent. More interesting than the theorem itself were
the conditions under which the risk-sensitive value of data reduced to
the risk-indifferent value of data. When these conditions held, we
found that the risk aversion coefficient could be treated as a state
variable for value of data calculations.

Chapter 4 considered the penalty for rough quantization of a
decision variable. We found that although fine quantization was superior
to rough quantization, the expected loss from rough quantization was
negligibly small. The sensitive parameter was sharpness, a measure of
the curvature of the expected value over the range of the decision
variable.

Chapter 5 applied the results of Chapters 2, 3, and 4. The
flowcharts were presented to help the analyst balance encoding, risk
preference, and computation for future decision analyses.

Further  Research 

The  results  of  this  thesis  can  be  regarded  as  a  black  box.  The 
inputs  are  estimates  of  how  something  might  change  during  a  data  gener¬ 
ating  process  and  the  output  is  a  dollar  valuation  of  the  potential 
change.  The  most  valuable  future  research  would  be  a  series  of  boxes 
which  could  be  attached  to  the  front  of  our  black  box.  The  future 
boxes  would  contain  experimental  and  historical  data.  From  data  that 
is  easy  to  encode,  the  new  boxes  would  generate  the  input  for  our  black 
boxes. 

For  encoding  state  variables,  two  boxes  would  be  useful.  The  in¬ 
put  to  the  first  would  be  elementary  data  such  as  means  and  ranges  of 
state  variables.  The  output  would  divide  the  prior  variance  into  the 
prior  expectation  of  the  posterior  variance  and  the  prior  variance  of 
the  posterior  mean.  Research  comparing  preliminary  estimates  to 
thoroughly  encoded  probability  density  functions  would  be  required  to 
generate  the  box. 

The second encoding box would predict the variance of the posterior
mean as a function of the length of the encoding interview. Since the
cost should also be proportional to the length of the interview, this
box would allow us to compute the value of partial encoding. The
required research would include the design of good encoding techniques
as well as the recording of mean shifts during encoding sessions.

The  box  required  for  risk  preference  is  similar  to  the  second 
encoding  box.  It  should  predict  the  potential  shift  of  the  risk  aver¬ 
sion  coefficient  as  a  function  of  the  completeness  of  the  encoding. 

Computation is probably the most challenging area. A box is
needed for probability trees. It should predict the errors as a
function of the number of terminal nodes in a tree. The obstacle to be
overcome is that the errors in a probability tree are proportional to
high-order partial derivatives of the value function with respect to the
state variables. Therefore, the quadratic model of the value function
is not appropriate.

The application of the economics of decision analysis depends on
careful modeling so that encoding the potential data does not become a
burden.